Abstract. A recent body of experimental literature has studied empirical
game-theoretical analysis, in which we have partial knowledge of a
game, consisting of observations of a subset of the pure-strategy profiles 
and their associated payoffs to players. The aim is to find an exact or 
approximate Nash equilibrium of the game, based on these observations. 
It is usually assumed that the strategy profiles may be chosen in an on- 
line manner by the algorithm. We study a corresponding computational 
learning model, and the query complexity of learning equilibria for var- 
ious classes of games. We give basic results for bimatrix and graphical 
games. Our focus is on symmetric network congestion games. For directed 
acyclic networks, we can learn the cost functions (and hence compute an 
equilibrium) while querying just a small fraction of pure-strategy pro- 
files. For the special case of parallel links, we have the stronger result 
that an equilibrium can be identified while only learning a small fraction 
of the cost values. 



1 Introduction 

Suppose that we have a game G with a known set of players, and known strategy 
sets for each player. We want to design an algorithm to solve G, where the 
algorithm can only obtain information about G via payoff queries. In a payoff 
query, the algorithm proposes pure strategies for the players, and is told the 
resulting payoffs. The general research issue is to identify bounds on the number 
of payoff queries needed to find an equilibrium, subject to the assumption that 
G belongs to some given class of games. 

1.1 Motivation 

Given a game, especially one with many players, it is unreasonable to assume 
that anyone maintains an explicit representation of its payoff function, even if the 
game in question has a concise representation. However, in practice a reasonable 
modelling assumption is that given, say, a strategy profile for the players, we 
can determine their payoffs, or some estimate of the payoffs. We are interested 
in algorithms that find Nash equilibria using a sequence of queries, where a query 
proposes a strategy profile and gets told the payoffs. We would like to know under 
what conditions an algorithm can find a solution based on knowledge of some 
but not all of the game's payoffs, which is particularly important when there 
are many players, and the number of pure-strategy profiles is large. This kind 



of challenge (where you get observations of profile/payoff-vector pairs, and you
want to find an approximate equilibrium, as opposed to the unobserved payoffs)
has been the subject of experimental work [31,32,27,10], where [27] focuses on
the case (highly relevant to this work) where the algorithm selects a sequence of
pure profiles and gets told the resulting payoffs. In this paper, we introduce the
study of payoff-query algorithms from the algorithmic complexity viewpoint. We
are interested in upper and lower bounds on the query complexity of classes of 
games. 

From the theoretical perspective, we are studying a constrained class of al- 
gorithms for computing equilibria of games. The study of such constraints — 
especially when they lead to lower bounds or impossibility results — informs us 
about the approaches that a successful algorithm needs to apply. In the context of 
equilibrium computation, other kinds of constraint include uncoupled algorithms 
for computing equilibria [23,24], communication-constrained algorithms [22,6, 
21], and oblivious algorithms [8]. Of course, the restriction to polynomial-time 
algorithms is the best-known example of such a constraint. Based on the al- 
gorithms and open problems identified in this paper, we find this to be quite a 
compelling motivation for the further study of the payoff-query model. There are 
various related kinds of query models that are suggested by the payoff queries 
studied here, which may also be of similar theoretical interest; we discuss these 
in Section 6. 

1.2 Games and query models 

In this paper we introduce the study of payoff queries for strategic-form games.
We also consider two models of concisely represented games: graphical games 
[28], where players are nodes in a given graph and the payoff of a player only 
depends on the strategies of its neighbors in the graph, and symmetric network 
congestion games [12], where the strategy space of the players corresponds to 
the set of paths that connect two nodes in a network. 

For a strategic-form game, we assume that initially the querying algorithm 
only knows n the number of players and k the number of pure strategies that 
each player has. Generally we seek algorithms that are polynomial in these pa- 
rameters. 

Definition 1. A payoff query to a strategic-form game G selects a pure-strategy 
profile s for G, and is given as response, the payoffs that G's players derive 
from s. 

There are $k^n$ pure-strategy profiles in a game, and one could learn the game
exhaustively using this many payoff queries. We are interested in algorithms that 
require only a small fraction of this trivial upper bound. 
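The query model of Definition 1 can be made concrete with a small sketch. The class `PayoffOracle`, the function `learn_exhaustively`, and the toy payoff function `demo_payoffs` below are hypothetical names introduced only for illustration; the sketch simply counts queries and shows that exhaustive learning costs $k^n$ of them.

```python
from itertools import product


class PayoffOracle:
    """Hides a payoff function and counts payoff queries (illustrative helper)."""

    def __init__(self, n, k, payoff_fn):
        self.n = n                    # number of players
        self.k = k                    # pure strategies per player
        self._payoff_fn = payoff_fn   # hidden from the querying algorithm
        self.queries = 0

    def query(self, profile):
        """Answer one payoff query: a pure profile in, one payoff per player out."""
        assert len(profile) == self.n
        assert all(0 <= s < self.k for s in profile)
        self.queries += 1
        return tuple(self._payoff_fn(profile))


def learn_exhaustively(oracle):
    """Trivial upper bound: query every one of the k**n pure profiles."""
    return {p: oracle.query(p) for p in product(range(oracle.k), repeat=oracle.n)}


def demo_payoffs(profile):
    # Toy game (an assumption for the demo): player i earns 1 when it matches
    # the strategy of player i+1 (cyclically), and 0 otherwise.
    n = len(profile)
    return [1.0 if profile[i] == profile[(i + 1) % n] else 0.0 for i in range(n)]


oracle = PayoffOracle(n=3, k=2, payoff_fn=demo_payoffs)
table = learn_exhaustively(oracle)
```

With n = 3 and k = 2 the counter ends at 2^3 = 8 queries; the algorithms of interest aim to use far fewer than this trivial bound.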

For a symmetric network congestion game, we assume that initially the al- 
gorithm only knows the number of players n, and the pure strategy space, given 
by the graph and the common source/destination pair. 



Definition 2. A payoff query to a congestion game selects an assignment of at 
most n players to every pure strategy of the congestion game and learns the costs 
of the pure strategies with the assigned loads. 

This query model removes the restriction that a query corresponds to a strat- 
egy profile, so it is more powerful than the query model for strategic-form games. 
It is a very natural model for congestion games, and we prove both lower and 
upper bounds on the payoff query complexity of congestion games under this 
model. 

Definition 3. The payoff query complexity of a class of games $\mathcal{G}$, with respect to
some solution concept such as exact or approximate Nash equilibrium, is defined
as follows. It is the smallest N such that there is some algorithm A that, given
N payoff queries to any game $G \in \mathcal{G}$ (where initially none of the payoffs of G
are known) can find a solution of G.

The definition imposes no computational bound on the algorithm A. It is 
to some extent inspired by the work on query-based learning initiated by An- 
gluin [2], in the context of computational learning theory. Note that A may 
select the queries in an on-line manner, so queries can depend on the responses 
to previous queries. 

1.3 Overview of results

We study a variety of different settings. We start by considering bimatrix games, 
and our first result is a lower bound for computing an exact Nash equilibrium: 
computing an exact Nash equilibrium in a $k \times k$ bimatrix game has payoff query
complexity $k^2$, even for zero-sum games. In other words, we have to query every
pure strategy profile. 

We then turn our attention to approximate Nash equilibria, where we obtain 
some more positive results. With the standard assumption that all payoffs lie 
in the range [0, 1], we show that when $\epsilon$ lies in the range $\frac{1}{2} \le \epsilon < 1 - \frac{1}{k}$, the
payoff query complexity to find an $\epsilon$-Nash equilibrium lies in $[k, 2k - 1]$. When
$\epsilon \ge 1 - \frac{1}{k}$, no payoff queries are needed at all, because an $\epsilon$-Nash equilibrium
is always achieved when both players play the uniform distribution over their 
strategies. 

The query complexity of computing an approximate Nash equilibrium when
$\epsilon < \frac{1}{2}$ appears to be a challenging problem, and we provide an initial lower
bound in this direction. We show that for a $k \times k$ bimatrix game, the payoff
query complexity of finding an $\epsilon$-Nash equilibrium, for $\epsilon < \frac{1}{8}$, is at least
$k \cdot \left(\frac{1}{32/\log(k) + 64\epsilon}\right)$. For small $\epsilon$ this lower bound is $\Omega(k \cdot \log k)$. This gives an
interesting contrast with the $\epsilon \ge \frac{1}{2}$ case. Whereas we can always compute a
$\frac{1}{2}$-approximate Nash equilibrium using $O(k)$ payoff queries, there exists an $\epsilon < \frac{1}{2}$
for which this is not the case.

Having studied payoff query complexity in bimatrix games, it is then natural 
to look for improved payoff query complexity results in the context of "struc- 
tured" games. In particular, we are interested in concisely represented games. 



where the payoff query complexity may be much smaller than the number of 
pure strategy profiles. As an initial result in this direction, we consider graphical 
games, where we show that for constant d the payoff query complexity of degree d
graphical games is polynomial. This algorithm works by discovering every
payoff in the game; however, unlike bimatrix games, this can be done without
querying every pure strategy profile.

Finally, we focus on two different models of congestion games. We first con- 
sider parallel links, where the game has a start and end vertex, and m different 
links between them. We show both lower and upper bounds for this model. If n 
denotes the number of players, then we obtain a $\log(n)$ payoff query lower bound,
and an $O\left(\log(n) \cdot \frac{\log^2(m)}{\log\log(m)}\right)$ payoff query upper bound. Note that there are $n \cdot m$
different payoffs in a parallel links game, and so our upper bound implies that
you do not need to discover the entire payoff function in order to solve a parallel
links game.

We also consider the more general case of symmetric network congestion 
games played on directed acyclic graphs. We show that if we have a game with 
m edges and n players, then we can find a Nash equilibrium using $m \cdot n$ payoff
queries. This algorithm works by discovering every payoff in the game, but does so by
querying a small fraction of the pure strategy profiles.

2 Related work 

In the existing literature, there has so far been mainly experimental work on 
payoff queries. There is also another body of work, both theoretical and experimental,
on best- and better-response dynamics, which is relevant because these
dynamics generally work by exploring the space of pure profiles, and receiving
feedback consisting of payoffs. The difference is that they purport to model a
decentralised process of selfish behaviour by the players, while the payoff query
model envisages a centralised algorithm that is less constrained. In this section,
we give a survey of the relevant literature.

2.1 Payoff queries 

In empirical game-theoretic analysis [32,26], a game is presented to the analyst
via a data set of observations of strategy profiles (usually, pure-strategy
profiles) and their corresponding payoffs. This data set of profile/payoff-vector
pairs is called an empirical game. In some settings the strategy profiles are randomly
generated, but it is typically feasible to obtain observations via the payoff
queries we study here. The profile selection problem [27] refers to the challenge 
of choosing helpful strategy profiles. The strategy exploration problem [26] is a 
special case of the profile selection problem, of finding the best way to limit the 
search to a small subset of a potentially large set of strategies. 

Jordan et al. [27] envisage a setting where a game of interest (called a base 
game) has a corresponding game simulator, an implementation in software, which 
is amenable to payoff queries; a more general scenario allows the observed payoffs 



to be sampled from a distribution associated with the strategy profile, rather 
than being deterministic. The distribution is sometimes considered to be due to 
a noise process, and called the noisy payoff model in [27]. (In this paper we just 
consider deterministic payoffs, the "revealed payoff model" in [27].) As noted in 
Vorobeychik et al. [31], a profile can be repeatedly queried to sample from the
distribution of payoffs, and thus get an estimate of the expected values. The 
two interacting challenges are to identify helpful queries, and to use them to 
find pure-strategy profiles that have low regret (where regret refers to the largest 
incentive to deviate, amongst the players). The quality of a solution is its regret 
with respect to the base game. A profile s is confirmed if all unilateral deviations 
have been queried, so that its regret is known. Any output solution needs to be 
confirmed in order to have any known bounds on its regret. 

Vorobeychik et al. [31] study an alternative learning objective in which a 
game belongs to a known class, and there is a "regression" challenge in deter- 
mining certain parameters of it; the information about the game consists of a 
random sample of pure profiles and resulting payoff vectors. This is called the 
payoff function approximation task. However, successful learning is measured by 
the extent to which the players' predicted behaviour is close to the behaviour 
associated with the true payoffs, rather than how well the true payoff functions 
are estimated. 

Work on specific classes of multi-player games includes the following. Duong
et al. [10] study algorithms for learning graphical games; we consider a graphical
game learning algorithm in Section 4. Jordan et al. [27] apply payoff-query
learning to various kinds of games generated by GAMUT [29], including a class 
of congestion games (Section 5.2). Vorobeychik et al. [31] investigate a first-price 
auction and also a scheduling game, where payoffs are described via a finite ran- 
dom sample of profile/payoff vector pairs. Earlier, Sureka and Wurman [30] study 
search for pure Nash equilibria of strategic-form games (mostly with 5 players 
and 10 pure strategies). 

Most of the experimental work (e.g. [30,27,10]) uses local search, in which 
profiles that get queried are typically very similar (differing in just one player's 
strategy) from previously queried profiles. Jordan et al. [27] experiment with 
local-search type algorithms in which when a player has the incentive to deviate, 
the tested profile is updated with that deviation. Sureka and Wurman [30] study 
search for pure equilibria via best-response dynamics while maintaining a tabu 
list, introduced to reduce the risk of cycles. In contrast, the algorithms we present 
here for congestion games exploit a non-local search approach, which results in 
improved performance. 

2.2 Best-response dynamics and local search 

There is a large body of literature that studies best- and better-response dynam- 
ics for classes of potential games, and gives bounds on the number of steps re- 
quired for convergence to pure-strategy equilibria. In this literature, the dynam- 
ics are intended to model the behavior of players who make rational responses 
to other players, in a decentralized setting. The dynamics of [11,14,19,5] are 



local search processes in that each pure profile is obtained from the previous 
one by letting a single player move. Bounds on the convergence of deterministic
best-response dynamics were considered in [11,14]. The better-response dynamics
considered by Goldberg [19] is the basic randomized local search algorithm,
and bounds are obtained for its convergence to exact equilibrium. Chien and Sin- 
clair [5] study another local search, the e-Nash dynamics, and its convergence to 
approximate equilibria. Other papers, e.g. [15,3] analyse a strongly-distributed 
dynamics in which multiple players can move in the same time step; conse- 
quently the dynamics is not a local search. However, these dynamical systems 
could all be simulated by payoff query algorithms in which at each step, at most 
nk queries are made to determine the change in payoffs available to players as 
a result of unilateral deviations. This paper begins to answer the question: how 
much better could a payoff query algorithm do, if it were not subject to that 
constraint? 

Finally, Alon et al. [1] consider payoff-query algorithms for finding the costs 
of paths in graphs. They consider weight discovery protocols where the aim is to
determine the costs of edges, and shortest path discovery protocols where the 
aim is to find a shortest path. The latter objective is more similar to what we 
consider, since it can avoid the need to learn the entire payoff function; also a 
shortest path is an equilibrium strategy for the one-player case. 

3 Bimatrix games 

In this section, we give simple bounds on the payoff-query complexity of comput- 
ing approximate Nash equilibria for bimatrix games. We assume that all payoffs 
lie in the range [0, 1], which is a standard assumption when finding approximate Nash
equilibria.

Observation 1 The payoff query complexity of finding an exact Nash equilibrium
of a $k \times k$ bimatrix game is $k^2$. This holds even for zero-sum games.

To see this, consider generalised matching pennies, where the column player
pays 1 to the row player whenever both players choose the same strategy; otherwise
the row player pays 1 to the column player.¹ This game has a unique Nash
equilibrium, namely when both players randomize uniformly over their strate- 
gies. Now suppose each payoff in the game is perturbed by a small quantity; for 
a zero-sum game the perturbations should preserve the zero-sum property. For 
small perturbations, there will still be a unique fully-mixed solution, but it can 
only be known exactly if all the payoffs are known exactly. Observation 1 raises 
the question of whether better bounds exist for approximate Nash equilibria. We 
have the following result: 

Theorem 4. For any $\frac{1}{2} \le \epsilon < 1 - \frac{1}{k}$, the payoff query complexity to find an
$\epsilon$-approximate Nash equilibrium of a 2-player strategic-form $k \times k$ game is in
$[k, 2k - 1]$.

¹ By rescaling, we can make this game comply with our assumption that payoffs should
lie in the range [0, 1]. 



Proof. For the upper bound, we can simulate the algorithm of [9] as follows to
obtain a $\frac{1}{2}$-approximate Nash equilibrium (which is then an $\epsilon$-Nash equilibrium).
Let $s_1$ be an arbitrary pure strategy of the row player. Query all k pure profiles
where the row player plays $s_1$, which allows us to find the column player's best
response to $s_1$; call it $s_2$. Query the additional $k - 1$ pure profiles where the
column player plays $s_2$, from which we find the row player's best response to $s_2$;
call it $s_3$. A $\frac{1}{2}$-approximate Nash equilibrium is obtained by letting the column
player play pure strategy $s_2$ while the row player randomizes equally between $s_1$
and $s_3$.

For the lower bound, suppose that the row player has payoff matrix $R_i$ for some $i \in [k]$,
where $R_i$ pays the row player 1 for playing row $i$ and 0 for any other row.
Then in an $\epsilon$-approximate Nash equilibrium, the row player must play row $i$ with
probability at least $1 - \epsilon > \frac{1}{k}$. In order to identify $i$, we need k queries. □
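The upper-bound half of the proof can be sketched in code. The function name `half_approx_equilibrium` and the callback `query(i, j)`, which is assumed to return the pair (row payoff, column payoff) for the pure profile (row i, column j), are illustrative choices rather than anything fixed by the paper; the random game is only a test instance.

```python
import random


def half_approx_equilibrium(k, query):
    """Return (row_mixed, col_pure, queries_used) using 2k - 1 payoff queries."""
    s1 = 0                                   # arbitrary pure strategy of the row player
    used = 0
    row_s1 = []
    for j in range(k):                       # k queries along row s1
        row_s1.append(query(s1, j))
        used += 1
    s2 = max(range(k), key=lambda j: row_s1[j][1])   # column's best response to s1
    col_s2 = {s1: row_s1[s2]}                # profile (s1, s2) is already known
    for i in range(k):                       # k - 1 further queries along column s2
        if i != s1:
            col_s2[i] = query(i, s2)
            used += 1
    s3 = max(range(k), key=lambda i: col_s2[i][0])   # row's best response to s2
    row_mixed = [0.0] * k
    row_mixed[s1] += 0.5                     # row mixes equally between s1 and s3
    row_mixed[s3] += 0.5
    return row_mixed, s2, used


random.seed(0)
k = 5
R = [[random.random() for _ in range(k)] for _ in range(k)]   # row payoffs in [0, 1]
C = [[random.random() for _ in range(k)] for _ in range(k)]   # column payoffs in [0, 1]
row_mixed, s2, used = half_approx_equilibrium(k, lambda i, j: (R[i][j], C[i][j]))

# Regret of each player in the returned profile.
row_payoff = sum(p * R[i][s2] for i, p in enumerate(row_mixed))
row_regret = max(R[i][s2] for i in range(k)) - row_payoff
col_payoffs = [sum(p * C[i][j] for i, p in enumerate(row_mixed)) for j in range(k)]
col_regret = max(col_payoffs) - col_payoffs[s2]
```

Both regrets are at most 1/2 whenever payoffs lie in [0, 1]: the row player loses at most half the gap between $s_3$ and $s_1$, and the column player is already best-responding to the half of the row player's mass that sits on $s_1$.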

For $\epsilon \ge 1 - \frac{1}{k}$, an $\epsilon$-Nash equilibrium is obtained by letting both players
play the uniform distribution over their strategies, and no payoff queries are
needed. The interesting challenge is to determine the payoff query complexity
for constant values of $\epsilon < \frac{1}{2}$. For $\epsilon < \frac{1}{2}$ we have no non-trivial upper bound on
the payoff query complexity of finding an $\epsilon$-Nash equilibrium. We next show a
lower bound (Theorem 8), which is increasing in $k$ and $\epsilon^{-1}$, and for large enough
$k$, $\epsilon^{-1}$, it can be seen to be higher than the upper bound for $\epsilon \ge \frac{1}{2}$. Thus, in
an algorithm-independent sense, there are some positive values of $\epsilon$ for which
computing an $\epsilon$-Nash equilibrium is harder than it is for other values of $\epsilon$ strictly
less than one.

Let $\mathcal{G}_\ell$ be the class of strategic-form games where the column player has $\ell$
pure strategies and the row player has $\binom{\ell}{\ell/2}$ pure strategies (where we assume $\ell$
is even). Let $G_\ell \in \mathcal{G}_\ell$ be the win-lose constant-sum game in which each row of
the row player's payoff matrix has $\frac{\ell}{2}$ 1's and $\frac{\ell}{2}$ 0's, all rows being distinct. The
column player's payoffs are one minus the row player's payoffs. ($G_\ell$ is similar to
the class of games used in Theorem 1 of [13].) Note that the value of $G_\ell$ is $\frac{1}{2}$
since either player can obtain payoff $\frac{1}{2}$ by using the uniform distribution over
their pure strategies.
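A small sketch can make this construction concrete. The helper name `build_row_matrix` is hypothetical; it enumerates all 0/1 rows of length $\ell$ with exactly $\ell/2$ ones, which is one way to realise the "all rows distinct" requirement with $\binom{\ell}{\ell/2}$ rows, and then checks that the uniform strategies guarantee payoff 1/2 to each side.

```python
from itertools import combinations
from math import comb


def build_row_matrix(ell):
    """Row player's payoff matrix: every 0/1 row with exactly ell/2 ones."""
    assert ell % 2 == 0, "ell is assumed to be even"
    rows = []
    for ones in combinations(range(ell), ell // 2):
        row = [0] * ell
        for j in ones:
            row[j] = 1
        rows.append(row)
    return rows


ell = 6
M = build_row_matrix(ell)

# Against a uniform column player, every row earns exactly 1/2.
row_vs_uniform_col = [sum(row) / ell for row in M]

# Against a uniform row player, every column pays the row player exactly 1/2.
col_vs_uniform_row = [sum(row[j] for row in M) / len(M) for j in range(ell)]
```

For $\ell = 6$ the matrix has 20 rows, and both lists consist only of the value 0.5, matching the claim that the value of the game is 1/2.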

Lemma 5. Suppose that in game $G_\ell$, the column player uses a mixed strategy
in which some single pure strategy has probability $\alpha \ge 1/\ell$. Then the row player
can obtain payoff $\ge \frac{1}{2} + \frac{\alpha}{2} - \frac{1}{2\ell}$.

Proof. Let $j$ be a column that the column player plays with probability $\alpha$. Let
$R_j$ be the set of rows where the row player obtains payoff 1 for column $j$. Suppose
the row player plays the uniform distribution over rows in $R_j$. When the column
player plays $j$, the row player receives payoff 1. Let $j' \ne j$ be a column, and
consider the payoffs to the row player where $j'$ intersects $R_j$. A fraction
$\frac{\ell/2 - 1}{\ell - 1}$ of these entries pay the row player 1, while a fraction $\frac{\ell/2}{\ell - 1}$ pay the
row player 0. Consequently whenever the column player plays $j' \ne j$, the row player's
expected payoff is $\frac{\ell/2 - 1}{\ell - 1}$. Thus with probability $\alpha$ the row player receives payoff
1, and with probability $1 - \alpha$ he receives payoff $\frac{\ell/2 - 1}{\ell - 1}$, from which the result
follows, by straightforward manipulations. (In particular, the row player's payoff

Corollary 6. Assume $\alpha > \frac{1}{\ell}$, and let $\epsilon = \frac{1}{4}(\alpha - \frac{1}{\ell})$. In any $\epsilon$-Nash equilibrium
of $G_\ell$, the column player plays any individual column with probability at most $\alpha$.

Proof. If, alternatively, the column player plays any column $j$ with probability
$> \alpha$, then by Lemma 5, the row player can obtain payoff $> \frac{1}{2} + \frac{\alpha}{2} - \frac{1}{2\ell}$. Hence in
any $\epsilon$-Nash equilibrium, the row player obtains payoff $> \frac{1}{2} + \frac{\alpha}{2} - \frac{1}{2\ell} - \epsilon = \frac{1}{2} + \epsilon$.
Hence the column player obtains payoff $< \frac{1}{2} - \epsilon$. Since the value of $G_\ell$ is $\frac{1}{2}$, the
column player's regret is $> \epsilon$, thus we do not have an $\epsilon$-Nash equilibrium.
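The "straightforward manipulations" behind Lemma 5 and the arithmetic used in Corollary 6 can be spelled out as follows. This is a worked derivation under the counting assumption that, among the rows with a 1 in column $j$, a fraction $\frac{\ell/2 - 1}{\ell - 1}$ also have a 1 in any other fixed column $j'$ (each such row has $\ell/2 - 1$ further ones spread over the remaining $\ell - 1$ columns).

```latex
\begin{align*}
\alpha \cdot 1 + (1-\alpha)\,\frac{\ell/2 - 1}{\ell - 1}
  &= \alpha + (1-\alpha)\left(\frac{1}{2} - \frac{1}{2(\ell-1)}\right) \\
  &= \frac{1}{2} + \frac{\alpha}{2} - \frac{1-\alpha}{2(\ell-1)} \\
  &\ge \frac{1}{2} + \frac{\alpha}{2} - \frac{1}{2\ell},
  \qquad \text{since } \alpha \ge \tfrac{1}{\ell}
  \iff \tfrac{1-\alpha}{\ell-1} \le \tfrac{1}{\ell}. \\[4pt]
\text{With } \alpha = 4\epsilon + \tfrac{1}{\ell}: \qquad
\frac{1}{2} + \frac{\alpha}{2} - \frac{1}{2\ell}
  &= \frac{1}{2} + 2\epsilon,
  \qquad\text{so}\qquad
  \frac{1}{2} + \frac{\alpha}{2} - \frac{1}{2\ell} - \epsilon = \frac{1}{2} + \epsilon .
\end{align*}
```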

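Lemma 7. For any $\epsilon < \frac{1}{8}$, $\ell > 8$, the payoff query complexity of finding an
$\epsilon$-Nash equilibrium for games in $\mathcal{G}_\ell$ is $\ge \binom{\ell}{\ell/2} \cdot \left(\frac{1}{16\epsilon + 4/\ell}\right)$.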

Proof. Let A be any payoff query learning algorithm; consider what happens
when A is run on $G_\ell$. Let S be the mixed strategy profile found by A. For
$\alpha = 4\epsilon + \frac{1}{\ell}$, by Corollary 6 no column is played with probability $> \alpha$ in S. We
also know that in S, the row player's payoff is $\le \frac{1}{2} + \epsilon$, since S is an $\epsilon$-Nash
equilibrium of a game with value $\frac{1}{2}$. Suppose for a contradiction that A made
fewer than $\binom{\ell}{\ell/2} \cdot \left(\frac{1}{16\epsilon + 4/\ell}\right)$ payoff queries. Then some row r received fewer than
$\frac{1}{16\epsilon + 4/\ell}$ queries.

If columns are selected at random from the column player's distribution in
S, there is probability at most $\alpha \left(\frac{1}{16\epsilon + 4/\ell}\right) = \frac{1}{4}$ that row r contains payoffs that
have already been queried by A. Now suppose that we modify $G_\ell$ by replacing
all un-queried entries of r with payoffs of 1 for the row player. A would output
the same strategy profile S, but the payoff to the row player for playing r is $\ge \frac{3}{4}$.
So the row player's regret is $\ge \frac{3}{4} - (\frac{1}{2} + \epsilon)$, that is, at least $\frac{1}{4} - \epsilon$. This gives us
a contradiction, for $\epsilon < \frac{1}{8}$. □

Theorem 8. For $k \times k$ strategic-form games, the payoff query complexity of
finding an $\epsilon$-Nash equilibrium, for $\epsilon < \frac{1}{8}$, is at least $k \cdot \left(\frac{1}{32/\log(k) + 64\epsilon}\right)$.

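Proof. Let $k'$ be the largest number of the form $\binom{\ell}{\ell/2}$ that is at most $k$. We
have $k' \ge k/4$ and $\ell \ge \log(k)/2$. By Lemma 7, $G_\ell$ has query complexity $\ge
\binom{\ell}{\ell/2} \cdot \left(\frac{1}{16\epsilon + 4/\ell}\right)$ to find an $\epsilon$-Nash equilibrium. That is,
$k' \left(\frac{1}{16\epsilon + 4/\ell}\right) \ge \frac{k}{4} \left(\frac{1}{16\epsilon + 4/\ell}\right) \ge \frac{k}{4} \left(\frac{1}{8/\log(k) + 16\epsilon}\right) = k \left(\frac{1}{32/\log(k) + 64\epsilon}\right)$.
A game G in $\mathcal{G}_\ell$ can be written down as a
$k \times k$ game, by duplicating rows and columns, and approximate equilibria are
preserved. □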

Note that for small $\epsilon$ the lower bound given by Theorem 8 is $\Omega(k \cdot \log k)$.



4 Graphical games 



In this section, we give a simple payoff query-based algorithm for graphical 
games. In a graphical game [28] we assume the players lie at the vertices of a 



degree-d graph, and a player's payoff is a function of the strategies of just himself 
and his neighbors. The number of payoff values needed to specify such a game is 
$n \cdot k^{d+1}$ which, in contrast with strategic-form games, is polynomial (assuming
d is a constant).

Previously, Duong et al. [10] have carried out experimental work on payoff 
queries for graphical games. They compare a number of techniques; the algorithm 
we give here is polynomial-time but would likely be less efficient in practice. 
Similar to [10], we assume the underlying graph G is unknown, and we want to
induce the structure of G, and corresponding payoffs. 

Theorem 9. For constant d, the payoff query complexity of degree d graphical 
games is polynomial. 

Algorithm 1 learns the entire payoff function with polynomially many queries. 

For a proof, see Appendix A. 

Algorithm 1 GraphicalGames
1: Initialize graph G's vertices to be the player set, with no edges
2: Let S be the set of pure profiles in which at least $n - (d+1)$ players play 1.
3: Query each element of S.
4: for all players $p$, $p'$ do
5:   if $\exists s, s' \in S$ that differ only in $p$'s payoff and $p'$'s strategy then
6:     add directed edge $(p, p')$ to graph
7:   end if
8: end for
9: for all players $p$ do
10:   Let $N_p$ be $p$'s neighborhood in G
11:   Use elements of S to find $p$'s payoffs as a function of strategies of $N_p$
12: end for
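To see why step 3 of Algorithm 1 uses only polynomially many queries for constant d, one can count the profiles in S directly. The sketch below is an illustration, not part of the algorithm: `num_profiles_in_S` and `brute_force_count` are hypothetical helper names, and strategy "1" is represented by index 0.

```python
from itertools import product
from math import comb


def num_profiles_in_S(n, k, d):
    """Count profiles where at most d + 1 players deviate from strategy '1'.

    Choose which j <= d + 1 players deviate, then one of the k - 1 other
    strategies for each of them.
    """
    return sum(comb(n, j) * (k - 1) ** j for j in range(d + 2))


def brute_force_count(n, k, d):
    """Same count by enumerating all k**n profiles (only feasible for tiny games)."""
    return sum(
        1
        for p in product(range(k), repeat=n)
        if p.count(0) >= n - (d + 1)      # index 0 plays the role of strategy '1'
    )


small = (num_profiles_in_S(4, 3, 1), brute_force_count(4, 3, 1))
large = (num_profiles_in_S(20, 2, 3), 2 ** 20)
```

For n = 20 binary-action players and d = 3, S contains 6196 profiles, compared with 2^20 = 1048576 pure profiles in total; in general |S| grows like $O(n^{d+1} k^{d+1})$.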



There are a couple of important caveats regarding Theorem 9. First, although
the payoff query complexity is polynomial, the computational complexity
is (probably) not polynomial, since it is PPAD-complete to actually compute
an approximate Nash equilibrium for graphical games [7]. Second, while Algorithm 1
avoids querying all (exponentially-many) pure-strategy profiles, it works
in a brute-force manner that learns the entire payoff function. We prefer payoff
query-based algorithms that allow a solution to be found without actually learn- 
ing all the payoffs, as we achieve in Theorem 4, or the main result of Section 5.1. 

5 Congestion games 

In this section, we give bounds on the payoff-query complexity of finding a pure 
Nash equilibrium in symmetric network congestion games. A congestion game 
is defined by a tuple $\Gamma = (N, E, (S_i)_{i \in N}, (f_e)_{e \in E})$. Here, $N = \{1, 2, \ldots, n\}$ is a
set of n players and E is a set of resources. Each player chooses as her strategy
a set $s_i \subseteq E$ from a given set of available strategies $S_i \subseteq 2^E$. Associated with
each resource $e \in E$ is a non-negative, non-decreasing function $f_e : \mathbb{N} \to \mathbb{R}^+$.
These functions describe costs to be charged to the players for using resource e.
An outcome (or strategy profile) is a choice of strategies $s = (s_1, s_2, \ldots, s_n)$ by
players with $s_i \in S_i$. For an outcome s define $n_e(s) = |\{i \in N : e \in s_i\}|$ as
the number of players that use resource e. The cost for player i is defined by
$c_i(s) = \sum_{e \in s_i} f_e(n_e(s))$. A pure Nash equilibrium is an outcome s where no player
has an incentive to deviate from her current strategy. Formally, s is a pure Nash
equilibrium if for each player $i \in N$ and $s'_i \in S_i$, which is an alternative strategy
for player i, we have $c_i(s) \le c_i(s_{-i}, s'_i)$. Here $(s_{-i}, s'_i)$ denotes the outcome that
results when player i changes her strategy in s from $s_i$ to $s'_i$.
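The definitions of load, cost and pure Nash equilibrium translate directly into code. The following is a minimal sketch under assumed representations: a strategy is a frozenset of resource names, a profile is a tuple of strategies, and `f` maps each resource to its cost function; the function names are illustrative.

```python
def loads(profile):
    """n_e(s): number of players using each resource e in the profile."""
    n_e = {}
    for s_i in profile:
        for e in s_i:
            n_e[e] = n_e.get(e, 0) + 1
    return n_e


def cost(i, profile, f):
    """c_i(s): sum of f_e(n_e(s)) over the resources in player i's strategy."""
    n_e = loads(profile)
    return sum(f[e](n_e[e]) for e in profile[i])


def is_pure_nash(profile, strategy_sets, f):
    """True if no player can strictly lower her cost by a unilateral deviation."""
    for i, S_i in enumerate(strategy_sets):
        current = cost(i, profile, f)
        for alt in S_i:
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if cost(i, deviated, f) < current:
                return False
    return True


# Toy instance (an assumption for the demo): two players, two parallel
# resources, each with identity cost f_e(x) = x.
e1, e2 = frozenset({"e1"}), frozenset({"e2"})
f = {"e1": lambda x: x, "e2": lambda x: x}
strategy_sets = [[e1, e2], [e1, e2]]
balanced = (e1, e2)   # one player per resource
crowded = (e1, e1)    # both players on e1
```

In this toy game the balanced profile is a pure Nash equilibrium (each player pays 1, and deviating would cost 2), while the crowded profile is not.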

In a network congestion game, resources correspond to the edges in a directed
multigraph $G = (V, E)$. Each player $i$ is assigned an origin node $o_i$, and
a destination node $d_i$. A strategy for player $i$ consists of a sequence of edges
that form a directed path from $o_i$ to $d_i$, and the strategy set $S_i$ consists of
all such paths. In a symmetric network congestion game all players have the
same origin and destination nodes. We write a symmetric network congestion
game as $\Gamma = (N, V, E, (f_e)_{e \in E}, o, d)$. Note that $V$, $E$, $o$, and $d$ collectively define
the strategy space $(S_i)_{i \in N}$. We consider two types of network: directed acyclic
graphs, and the special case of parallel links. We assume that initially we only
know the number of players n and the strategy space. The latency functions
are completely unknown initially. A payoff query presents an assignment of at
most n players to every strategy of the congestion game and learns the costs of
strategies with the assigned loads. As a shorthand for defining strategy profiles,
we use notation of the form $s \leftarrow (1 \mapsto p, 3 \mapsto q)$. This example defines s to be a
four-player strategy profile that assigns 1 player to p and 3 players to q, where
p and q will be paths from source to sink in a symmetric network congestion
game. We use Query(s) to denote a query with an assignment s of at most n
players to the pure strategies. It returns a function $c_s$, which gives the cost of
each strategy when s is played.

5.1 Parallel links

In this section, we consider congestion games on m parallel links. A simple 
construction with two parallel links shows a lower bound of log n. The essence 
of the construction is as follows. Link 1 has a constant cost of 1 for any number 
of players. Link 2 has a cost of which switches at some point to 2. To find a 
pure equilibrium a querier must find this switch, and an adversary has a simple 
strategy to ensure that the querier can do no better than binary search. The 
lower bound is presented in full detail in Appendix B. In the rest of the section, 
we provide a payoff query algorithm that finds a pure Nash equilibrium using
$O\left(\log(n) \cdot \frac{\log^2(m)}{\log\log(m)}\right)$ payoff queries.
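The two-link construction behind the lower bound suggests why roughly log n queries arise: the querier has to locate the load at which link 2's cost switches. The sketch below illustrates the matching binary search and is not a proof of the lower bound. It assumes that link 2's cost before the switch is below 1 (taken to be 0 here, which is an assumption of this example), and `find_switch_point` and `query_link2` are hypothetical names, where `query_link2(x)` stands for a payoff query that places x players on link 2 and reports its cost.

```python
def find_switch_point(n, query_link2):
    """Largest load x in [0, n] at which link 2 still costs less than link 1.

    Relies on link 2's cost being non-decreasing in the load. Returns the
    pair (x, number_of_queries).
    """
    lo, hi = 0, n          # invariant: the answer lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        queries += 1
        if query_link2(mid) < 1:   # still cheaper than link 1's constant cost 1
            lo = mid
        else:
            hi = mid - 1
    return lo, queries


n = 1000
threshold = 137            # hidden switch point chosen for the demo


def link2_cost(x):
    # Assumed shape of the hidden cost function: low up to the threshold, then 2.
    return 0 if x <= threshold else 2


x, queries = find_switch_point(n, link2_cost)
```

Placing x players on link 2 and the remaining n - x on link 1 is then a pure equilibrium of this instance: nobody on link 2 wants to pay 1 instead of 0, and anyone joining link 2 would push its cost to 2. The search halves the candidate range each time, so it uses at most 10 queries for n = 1000.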

Our algorithm works in phases. In each phase we move players only in some
multiple of $\delta = k^t$, for some integer parameter k, and we determine an equilibrium
with respect to this group size $\delta$. To find out how many groups to move
between the links we perform a double binary search: In the first binary search,
we guess the number of moved groups, which will give us a target cost value. We
use this target value in a second binary search to check whether our guess on the
number of groups to move was correct. In each subsequent iteration $\delta$ becomes
smaller by a factor k until in the last iteration $\delta = 1$ and a Nash equilibrium is
found.

The algorithm (ParallelLinks) is depicted in Algorithm 2. We will show 
how this algorithm can be implemented with the stated number of queries. The 
algorithm is inspired by an algorithm from [17]. 

Throughout the algorithm there will be a special link $a$. For a congestion game
$\Gamma$, an integer $\delta$, and a special link $a$ we define a $\delta$-equilibrium as follows:

Definition 10 ($\delta$-equilibrium). A strategy profile s is a $\delta$-equilibrium if $\delta \mid n_i(s)$
for all $i \in [m] \setminus \{a\}$, and for all links $i, j \in [m]$ with $n_i(s) \ge \delta$ we have $f_i(n_i(s)) \le
f_j(n_j(s) + \delta)$.

Intuitively, we can think of a $\delta$-equilibrium s as a Nash equilibrium in a transformed
game where the players (of the original game) are partitioned into groups
of size $\delta$ and each group represents a player in the transformed game, and the
remaining $(n \bmod \delta)$ players are fixed to link $a$.
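Definition 10 can be checked mechanically when the cost functions are known. The sketch below is illustrative only (the payoff-query algorithm never has the functions in hand): `is_delta_equilibrium` is a hypothetical helper, `load[i]` plays the role of $n_i(s)$, `f[i]` is the cost function of link i, and links are indexed from 0.

```python
def is_delta_equilibrium(load, f, delta, a):
    """Check the two conditions of a delta-equilibrium with special link a."""
    m = len(load)
    # Condition 1: every link other than a carries a multiple of delta players.
    for i in range(m):
        if i != a and load[i] % delta != 0:
            return False
    # Condition 2: no group of delta players on a link with at least delta
    # players would pay strictly less after moving to another link.
    for i in range(m):
        if load[i] < delta:
            continue
        for j in range(m):
            if f[i](load[i]) > f[j](load[j] + delta):
                return False
    return True


# Toy instance (an assumption for the demo): 6 players, two links with
# costs f_0(x) = x and f_1(x) = 2x, special link a = 0.
f = [lambda x: x, lambda x: 2 * x]
all_on_a = [6, 0]
split = [4, 2]
```

Here putting all 6 players on link 0 is a 6-equilibrium but not a 1-equilibrium, whereas the split (4, 2) is both a 2-equilibrium and a 1-equilibrium, that is, a pure Nash equilibrium.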

We start with an informal description of algorithm ParallelLinks. The algorithm
is parameterised by an integer $k \ge 2$. It starts by finding an $n$-equilibrium
s by putting all players together on the link $a$ that minimises the induced cost.
Observe that by Definition 10, s is also a $\delta$-equilibrium for any $\delta \ge n$. We can
find $a$ with a single payoff query, by querying for n players on each link. Note
that throughout the algorithm, link $a$ will take a special role since the number of
players assigned to any other link will always be a multiple of $\delta$. The algorithm
then works in $T + 1$ phases, where $T = \lfloor \frac{\log n}{\log k} \rfloor$. Each phase is one iteration of
the for-loop. The for-loop is governed by a variable $t$, which is initially $T$ and
decreases by 1 in each iteration. Within any iteration, the algorithm uses the
function RefineProfile to transform a $k^{t+1}$-equilibrium into a $k^t$-equilibrium.
We can bound the number of reassignments of groups of players to different links
needed to achieve this by the following lemma:

Lemma 11. We can convert a $k^{t+1}$-equilibrium s into a $k^t$-equilibrium s' by
moving at most 2k groups of $\delta = k^t$ players to any individual link and at most
km groups of $\delta$ players in total.

Proof. Since s is a $k^{t+1}$-equilibrium, we have $f_i(n_i(s)) \le f_j(n_j(s) + k^{t+1})$ for all
$i \in [m] \setminus \{a\}$, $j \in [m]$. Moreover, either (a) $f_a(n_a(s)) \le f_j(n_j(s) + k^{t+1})$ for all
$j \in [m]$ or (b) $n_a(s) < k^{t+1}$. In case (a), this implies that each link $j \in [m]$ can
in total receive at most k groups of size $\delta = k^t$ from links $i \in [m]$. In case (b),
this implies that each link $j \in [m]$ can in total receive at most k groups of size
$\delta = k^t$ from links $i \in [m] \setminus \{a\}$. Moreover, since $n_a(s) < k^{t+1}$ we can move at
most k groups of size $\delta = k^t$ from link $a$. In any case, in total we move at most
km groups. All links receive and lose players only in multiples of $\delta = k^t$ which
ensures that $k^t \mid n_i(s')$ for all $i \in [m] \setminus \{a\}$ is maintained.



Algorithm 2 ParallelLinks

 1: $a \leftarrow$ link $i \in [m]$ that minimises $f_i(n)$
 2: initialize strategy profile $s$ by putting all players on link $a$
 3: $T \leftarrow \lfloor \log n / \log k \rfloor$
 4: for $t = T, T-1, \ldots, 1, 0$ do
 5:     $\delta \leftarrow k^t$
 6:     $s \leftarrow$ RefineProfile($s, \delta, 0, km$)
 7: end for
 8: return $s$

 9: function RefineProfile($s, \delta, q_{\min}, q_{\max}$)
10:     $q \leftarrow \lfloor (q_{\min} + q_{\max})/2 \rfloor$
11:     Parallel for all links $i \in [m]$
12:         Query for costs $f_i(n_i(s) + r\delta)$ for all integers $1 \le r \le 2k$    ▷ $2k$ queries
13:     EndParallel
14:     $Q \leftarrow$ the ordered multiset of $2km$ non-decreasing costs from the above queries
15:     $c_{\min}(q) \leftarrow$ $(q+1)$-th smallest element of $Q$
16:     $p_i \leftarrow$ number of times $i \in [m]$ contributes a cost to the $q$ smallest elements of $Q$
17:     Parallel for all links $i \in [m]$
18:         if $f_i(n_i(s) - \lfloor n_i(s)/\delta \rfloor \cdot \delta) > c_{\min}(q)$ then    ▷ 1 query; only relevant for link $a$
19:             $q_i \leftarrow \lfloor n_i(s)/\delta \rfloor$
20:         else (using binary search on $q_i \in [0, \min\{km, \lfloor n_i(s)/\delta \rfloor\}]$)
21:             $q_i \leftarrow \min\{q_i : f_i(n_i(s) - q_i\delta) \le c_{\min}(q)\}$    ▷ $\log(km)$ queries
22:         end if
23:     EndParallel
24:     if $\sum_{i \in [m]} q_i = q$ then
25:         modify $s$ by removing $q_i$ and adding $p_i$ groups of $\delta$ players to every link $i \in [m]$
26:         return $s$
27:     else if $\sum_{i \in [m]} q_i < q$ then
28:         return RefineProfile($s, \delta, q_{\min}, q-1$)
29:     else ($\sum_{i \in [m]} q_i > q$)
30:         return RefineProfile($s, \delta, q+1, q_{\max}$)
31:     end if
32: end function



Refining the profile. Our algorithm uses the function RefineProfile to implement
the refinement as described in Lemma 11. RefineProfile determines the number of
groups $q$ which have to be moved by binary search on $q$ in $[0, km]$. Since by
Lemma 11 each link receives at most $2k$ groups of players, we spend $2k$ payoff
queries to determine, for all links $i \in [m]$, the cost function values
$f_i(n_i(s) + r \cdot \delta)$ for all integers $1 \le r \le 2k$. We define $Q$ as the
multiset of these cost function values and $c_{\min}(q)$ as the $(q+1)$-th smallest
value in $Q$. Intuitively, $c_{\min}(q)$ is the cost of the $(q+1)$-th group of players
that we would move. We use $c_{\min}(q)$ to find out how many groups of players $q_i$
we need to remove from each link $i \in [m]$ so that on each link the cost is at most
$c_{\min}(q)$, or we cannot remove any further groups because there are fewer than
$\delta$ players assigned to it (which can only happen on link $a$). By Lemma 11, we
need to remove at most $km$ groups of players in total. Therefore, we can determine
$q_i \in [0, \min\{km, \lfloor n_i(s)/\delta \rfloor\}]$ by binary search in parallel on
all links, with $O(\log(km))$ payoff queries. Now, if $\sum_{i \in [m]} q_i = q$, we can
construct a $k^t$-equilibrium by removing $q_i$ and adding $p_i$ groups of $\delta$
players to link $i \in [m]$; note that for every $i \in [m]$, either $q_i = 0$ or
$p_i = 0$. If $\sum_{i \in [m]} q_i \ne q$, our guess for $q$ was not correct and we have
to continue the binary search on $q$.
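The computation of $c_{\min}(q)$ and the counts $p_i$ can be illustrated with a minimal Python sketch. It assumes the cost functions are available as Python callables standing in for payoff queries, and it breaks ties in $Q$ by link index; both choices, and the name `cmin_and_receivers`, are illustrative assumptions rather than part of the algorithm as stated.

```python
def cmin_and_receivers(costs, loads, delta, k, q):
    """Sketch of lines 12-16 of RefineProfile (assumed interface).

    costs[i](x) plays the role of a payoff query for f_i(x); loads[i] is n_i(s).
    Returns (c_min(q), p), where p[i] counts how many of the q cheapest
    entries of Q come from link i. Requires 0 <= q < 2 * k * len(costs).
    """
    entries = []
    for i, f in enumerate(costs):
        for r in range(1, 2 * k + 1):
            # cost on link i after it receives r further groups of delta players
            entries.append((f(loads[i] + r * delta), i))
    entries.sort()  # ties broken by link index (an assumption of this sketch)
    c_min = entries[q][0]  # the (q+1)-th smallest element of Q
    p = [0] * len(costs)
    for _, i in entries[:q]:
        p[i] += 1
    return c_min, p
```

For two links with costs $x$ and $2x$, empty loads, $\delta = 1$, $k = 1$ and $q = 1$, the multiset is $Q = \{1, 2, 2, 4\}$, so the sketch reports $c_{\min}(1) = 2$ and assigns the single cheapest slot to the first link.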

The algorithm maintains the following invariant: 

Lemma 12. RefineProfile($s, \delta, 0, km$) returns a $\delta$-equilibrium.

Proof. Observe that $\delta = k^t$. In the first iteration of the for-loop $t = T$ and
RefineProfile($s, \delta, 0, km$) gets an $n$-equilibrium as input, which is also a
$k^{T+1}$-equilibrium as all players are assigned to link $a$ and $k^{T+1} > n$. So to
prove the claim, it suffices to show that RefineProfile($s, k^t, 0, km$) returns a
$k^t$-equilibrium if $s$ is a $k^{t+1}$-equilibrium. For the $s$ returned by
RefineProfile and the $q$ in its returning call, we have
$f_i(n_i(s)) \le c_{\min}(q) \le f_i(n_i(s) + \delta)$ for all $i \in [m] \setminus \{a\}$.
The left inequality follows from line 21 of the algorithm. The right inequality follows
from the definition of $c_{\min}(q)$ as the $(q+1)$-th smallest element in $Q$ in line 15
of the algorithm. For link $a$, we have
$f_a(n_a(s)) \le c_{\min}(q) \le f_a(n_a(s) + \delta)$, or we have
$f_a(n_a(s)) > c_{\min}(q)$ and $n_a(s) < \delta$, where the first case follows from
lines 21 and 15 as before, and the second case corresponds to line 18. Noting that
RefineProfile maintains that for the returned $s$ we have $\delta \mid n_i(s)$ for all
$i \in [m] \setminus \{a\}$, as it only moves groups of size $\delta$, the claim follows.

Lemma 13. RefineProfile($s, \delta, 0, km$) makes $O(\log(km)(k + \log(km)))$ payoff
queries.

Proof. For each value of $q$ in the binary search, we make $O(k)$ payoff queries
to determine $c_{\min}(q)$ and $O(\log(km))$ payoff queries to determine the $q_i$'s in
parallel for all links $i \in [m]$. The binary search on $q$ adds a factor $\log(km)$.

Theorem 14. Algorithm ParallelLinks returns a pure Nash equilibrium and makes
$O\left(\log(n) \cdot \frac{\log^2(m)}{\log\log(m)}\right)$ payoff queries.

Proof. In the last iteration of the for-loop, we have $\delta = 1$, so Lemma 12 implies
that $s$ is a pure Nash equilibrium. To find the best link in line 1 of the algorithm,
we need one payoff query. For any $k \ge 2$, the algorithm does
$T + 1 = O\left(\frac{\log(n)}{\log(k)}\right)$ iterations of the for-loop. In each
iteration we do $O(\log(km)(k + \log(km)))$ payoff queries. Choosing
$k = \Theta(\log(m))$ yields the stated upper bound. □
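The substitution in the last step of the proof can be spelled out as follows. This is a sketch of the arithmetic only, assuming $k = \Theta(\log m)$, so that $\log k = \Theta(\log\log m)$ and $\log(km) = \log k + \log m = O(\log m)$.

```latex
\begin{align*}
T + 1 &= O\!\left(\frac{\log n}{\log k}\right)
       = O\!\left(\frac{\log n}{\log\log m}\right), \\
\log(km)\,\bigl(k + \log(km)\bigr) &= O(\log m) \cdot O(\log m) = O(\log^2 m), \\
(T + 1) \cdot \log(km)\,\bigl(k + \log(km)\bigr)
      &= O\!\left(\frac{\log n \cdot \log^2 m}{\log\log m}\right).
\end{align*}
```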



5.2 Symmetric Network Congestion Games on DAGs 

In this section, we consider symmetric network congestion games on DAGs. 

Throughout this section, we consider the game $\Gamma = (N, V, E, (f_e)_{e \in E}, o, d)$,
where $(V, E)$ is a DAG. We use the $\prec$ relation to denote a topological ordering
over the vertices in $V$. We assume that, for every vertex $v \in V$, there exists a
path from $o$ to $v$, and there exists a path from $v$ to $d$. If either of these
conditions does not hold for some vertex $v$, then $v$ cannot appear on an $o$-$d$
path, and so it is safe to delete $v$.



Preprocessing. Before we present the algorithm, we describe a required preprocessing
step. We say that edges $e$ and $e'$ are dependent if visiting one implies
that we must visit the other. More formally, $e$ and $e'$ are dependent if, for every
$o$-$d$ path $p$, we either have $e, e' \in p$, or we have $e, e' \notin p$. We
preprocess the game to ensure that there are no pairs of dependent edges. To do this,
we check every pair of edges $e$ and $e'$, and test whether they are dependent. If they
are, then we contract $e'$, i.e., if $e' = (v, u)$, then we delete $e'$, and set
$v = u$. The following lemma shows that this preprocessing is valid.

Lemma 15. There is an algorithm that, given a congestion game $\Gamma$, where
$(V, E)$ is a DAG, produces a game $\Gamma'$ with no pair of dependent edges, such
that every Nash equilibrium of $\Gamma'$ can be converted to a Nash equilibrium of
$\Gamma$. The algorithm and conversion of equilibria take polynomial time and make zero
payoff queries.

Thus, we assume that our congestion game contains no pair of dependent edges.

Equivalent cost functions. Our approach is to devise a querying strategy to
determine the cost function of each edge. That is, for each $e \in E$, we give an
algorithm to discover $f_e(i)$ for all $i$. One immediate observation is that we can
never hope to find the actual cost functions. Consider the following one-player
congestion game.



If we set $f_a(1) = f_b(1) = 1$ and $f_c(1) = f_d(1) = 0$, then all $o$-$d$ paths have
cost 1. However, we could also achieve the same property by setting
$f_a(1) = f_b(1) = 0$ and setting $f_c(1) = f_d(1) = 1$. Thus, it is impossible to learn
the actual cost functions using payoff queries. Fortunately, we do not need to learn
the actual cost function in order to solve the congestion game. We define two cost
functions to be equivalent if they assign the same cost to every strategy profile.

Definition 16 (Equivalence). Two cost functions $f$ and $f'$ are equivalent
if for every strategy profile $s = (s_1, s_2, \ldots, s_n)$, we have
$\sum_{e \in s_i} f_e(n_e(s)) = \sum_{e \in s_i} f'_e(n_e(s))$, for all $i$.
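The one-player example above can be checked mechanically. The sketch below assumes a hypothetical two-stage network in which the player first takes edge a or b and then edge c or d; the original figure is not reproduced here, so this topology is an assumption chosen to be consistent with the stated path costs.

```python
from itertools import product

# Assumed topology: o -> (a or b) -> intermediate vertex -> (c or d) -> d
FIRST_STAGE = ["a", "b"]
SECOND_STAGE = ["c", "d"]

# The two one-player cost assignments discussed in the text
f1 = {"a": 1, "b": 1, "c": 0, "d": 0}
f2 = {"a": 0, "b": 0, "c": 1, "d": 1}


def path_costs(f):
    """Cost a single player pays on each o-d path under cost assignment f."""
    return {(x, y): f[x] + f[y] for x, y in product(FIRST_STAGE, SECOND_STAGE)}


# A payoff query only reveals path costs, and these coincide for f1 and f2
same_on_all_paths = path_costs(f1) == path_costs(f2)
```

Since every payoff query returns only the cost of a whole path, no sequence of queries can distinguish `f1` from `f2` in this sketch, which is exactly the situation Definition 16 is designed to capture.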

Clearly, the Nash equilibria of a game cannot change if we replace its cost function
$f$ with an equivalent cost function $f'$. We give an algorithm that constructs
a cost function $f'$ that is equivalent to $f$. We say that $(f'_e)_{e \in E}$ is a
partial cost function if for some $e \in E$ and some $i \le n$, $f'_e(i)$ is undefined.
We say that $f''$ is an extension of $f'$ if $f''$ is a partial cost function, and if
$f''_e(i) = f'_e(i)$ for every $e \in E$ and $i \le n$ for which $f'_e(i)$ is defined.
We say that $f''$ is a total extension of $f'$ if $f''$ is an extension of $f'$, and if
$f''_e(i)$ is defined for all $e \in E$ and all $i \le n$.

Definition 17 (Partial equivalent cost function). Let $f$ be a cost function.
We say that $f'$ is a partial equivalent of $f$ if $f'$ is a partial cost function, and
if there exists a total extension $f''$ of $f'$ such that $f''$ is equivalent to $f$.

Our algorithm will begin with a partial cost function $f^0$ such that $f^0_e(i)$ is
undefined for all $e \in E$ and all $i \le n$. Clearly $f^0$ is a partial equivalent of
$f$. It then constructs a sequence of partial cost functions $f^1, f^2, \ldots$, where
each $f^a$ is an extension of $f^{a-1}$, and each $f^a$ is a partial equivalent of $f$.
Therefore, the algorithm will eventually arrive at a total cost function $f^a$ that is
equivalent to the original cost function $f$.

The algorithm proceeds inductively. It begins by constructing a partial equivalent
cost function $f^a$ such that $f^a_e(1)$ is defined for every edge $e \in E$. Then,
in each subsequent step, it takes a partial equivalent cost function $f^a$ such that
$f^a_e(j)$ is defined whenever $j \le i$, and it constructs a partial equivalent cost
function $f^{a'}$, where $f^{a'}_e(j)$ is defined whenever $j \le i + 1$.

5.3 Symmetric Network Congestion Games on DAGs: The 
one-player case 

We begin by describing a method that computes a partial equivalent cost function
$f^a$ such that $f^a_e(1)$ is defined for every edge $e \in E$. The algorithm begins
with the partial cost function $f^0$. The algorithm processes vertices according to
the topological ordering $\prec$. When the algorithm processes a vertex $k \in V$, it
begins with a partial equivalent cost function $f^a$ such that $f^a_e(1)$ is defined
for every edge $e = (v, u)$ with $u \prec k$. It then produces a partial equivalent
cost function $f^{a+1}$ such that $f^{a+1}_e(1)$ is defined for every edge $e = (v, u)$
with $u \preceq k$. There are two cases to consider, depending on whether $k = d$ or
not.



Algorithm 3 ProcessK

Input: A partial equivalent cost function $f^a$ such that $f^a_e(1)$ is defined for all
    edges $(v, u)$ with $u \prec k$.
Output: A partial equivalent cost function $f^{a+1}$, such that $f^{a+1}_e(1)$ is
    defined for all edges $(v, u)$ with $u \preceq k$.
 1: for all $e$ for which $f^a_e(1)$ is defined do
 2:     $f^{a+1}_e(1) \leftarrow f^a_e(1)$
 3: end for
 4: $p \leftarrow$ an arbitrary $k$-$d$ path
 5: for all $e = (v, k) \in E$ do
 6:     $p' \leftarrow$ an arbitrary $o$-$v$ path
 7:     $s \leftarrow (1 \mapsto p'ep)$
 8:     $c_s \leftarrow$ Query($s$)
 9:     $t(ep) \leftarrow c_s(p'ep) - \sum_{e' \in p'} f^a_{e'}(1)$
10: end for
11: $m \leftarrow$ edge $e = (v, k)$ that minimises $t(ep)$
12: $f^{a+1}_m(1) \leftarrow 0$
13: for all $e = (v, k) \in E$ with $e \ne m$ do
14:     $f^{a+1}_e(1) \leftarrow t(ep) - t(mp)$
15: end for

The $k \ne d$ case. We use the procedure shown in Algorithm 3 to process $k$.

Lines 1 through 3 simply copy the old cost function $f^a$ into the new cost function
$f^{a+1}$. This ensures that $f^{a+1}$ is an extension of $f^a$. The algorithm then
picks an arbitrary $k$-$d$ path $p$. The loop on lines 5 through 10 computes the
function $t$, which for each incoming edge $e = (v, k)$, gives the cost of allocating
one player to $ep$. Note, in particular, that the value of the expression
$\sum_{e' \in p'} f^a_{e'}(1)$ is known to the algorithm, because every vertex visited
by $p'$ has already been processed. The algorithm then selects $m$ to be the edge that
minimises $t$, and sets the cost of $m$ to be 0. Once it has done this, lines 13
through 15 compute the costs of the other edges relative to $m$.
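The one-player procedure can be sketched in Python as follows. The representation is an assumption of this sketch: edges are a dictionary from an edge name to a (tail, head) pair, `order` lists the vertices in topological order with the origin first and the destination last, and `query(path)` is a hypothetical stand-in for a one-player payoff query. The same loop also covers the destination vertex by using an empty suffix path and no normalisation, in the spirit of Algorithm 4.

```python
def find_path(edges, src, dst):
    """Return some src-dst path as a list of edge names, or None if none exists."""
    if src == dst:
        return []
    for name, (tail, head) in edges.items():
        if tail == src:
            rest = find_path(edges, head, dst)
            if rest is not None:
                return [name] + rest
    return None


def learn_one_player_costs(edges, order, query):
    """Learn one-player edge costs that are equivalent to the true ones."""
    o, d = order[0], order[-1]
    learned = {}
    for k in order[1:]:
        incoming = [e for e, (_, head) in edges.items() if head == k]
        # arbitrary k-d path p; empty when k is the destination itself
        suffix = [] if k == d else find_path(edges, k, d)
        t = {}
        for e in incoming:
            prefix = find_path(edges, o, edges[e][0])  # arbitrary o-v path p'
            # one payoff query per incoming edge, minus the already-known prefix cost
            t[e] = query(prefix + [e] + suffix) - sum(learned[x] for x in prefix)
        # for k != d the cheapest incoming edge m is normalised to cost 0
        shift = 0 if k == d else min(t.values())
        for e in incoming:
            learned[e] = t[e] - shift
    return learned
```

On a small network with two parallel edges into a middle vertex and two parallel edges out of it, this sketch issues one query per edge and returns costs that differ from the true ones edge by edge, yet give every origin-destination path its true cost.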

When we set the cost of $m$ to be 0, we are making use of equivalence. Suppose
that the actual cost of $m$ is $c_m$. Setting the cost of $m$ to be 0 has the following
effects:

— Every incoming edge at $k$ has its cost reduced by $c_m$.

— Every outgoing edge at $k$ has its cost increased by $c_m$.

This maintains equivalence with the original cost function, because for every path
$p$ that passes through $k$, the total cost of $p$ remains unchanged. The following
lemma formalises this and proves that $f^{a+1}$ is indeed a partial equivalent cost
function.

Lemma 18. Let $k \ne d$ be a vertex, and let $f^a$ be a partial equivalent cost
function such that $f^a_e(1)$ is defined for all edges $e = (v, u)$ with $u \prec k$.
When given these inputs, Algorithm 3 computes a partial equivalent cost function
$f^{a+1}$ such that $f^{a+1}_e(1)$ is defined for all edges $e = (v, u)$ with
$u \preceq k$.



The $k = d$ case. When the algorithm processes $d$, it will have a partial cost
function $f^a$ such that $f^a_e(1)$ is defined for every edge $e = (v, u)$ with
$u \prec d$. The algorithm is required to produce a partial cost function $f^{a+1}$
such that $f^{a+1}_e(1)$ is defined for all $e \in E$. We use Algorithm 4 to do this.
Lines 1 through 3 ensure that $f^{a+1}$ agrees with $f^a$ wherever $f^a$ is defined.
Then, the algorithm loops through each incoming edge $e = (v, d)$, and line 8 computes
$f^{a+1}_e(1)$. Note, in particular, that $f^a_{e'}(1)$ is defined for every edge
$e' \in p$, and thus the computation on line 8 can be performed. Lemma 19 shows that
Algorithm 4 is correct.

Algorithm 4 ProcessD

Input: A partial equivalent cost function $f^a$ such that $f^a_e(1)$ is defined for all
    edges $e = (v, u)$ with $u \prec d$.
Output: A partial equivalent cost function $f^{a+1}$, such that $f^{a+1}_e(1)$ is
    defined for all edges $e \in E$.
 1: for all $e$ for which $f^a_e(1)$ is defined do
 2:     $f^{a+1}_e(1) \leftarrow f^a_e(1)$
 3: end for
 4: for all $e = (v, d) \in E$ do
 5:     $p \leftarrow$ an arbitrary $o$-$v$ path
 6:     $s \leftarrow (1 \mapsto pe)$
 7:     $c_s \leftarrow$ Query($s$)
 8:     $f^{a+1}_e(1) \leftarrow c_s(pe) - \sum_{e' \in p} f^a_{e'}(1)$
 9: end for

Lemma 19. Let $k = d$, and let $f^a$ be a partial equivalent cost function such that
$f^a_e(1)$ is defined for all edges $(v, u)$ with $u \prec d$. When given these inputs,
Algorithm 4 computes a partial equivalent cost function $f^{a+1}$.

Query complexity. The algorithm makes exactly $|E|$ payoff queries in order to
find the one-player costs. When Algorithm 3 processes a vertex $k$, it makes
exactly one query for each incoming edge $(v, k)$ at $k$. The same property holds
for Algorithm 4. This implies that, in total, the algorithm makes $|E|$ queries.

5.4 Symmetric Network Congestion Games on DAGs: Many-player 
games 

We now assume that we have a partial equivalent cost function $f^a$ such that
$f^a_e(j)$ is defined whenever $j \le i$. We give an algorithm to produce a partial cost
function $f^{a'}$, such that $f^{a'}_e(j)$ is defined whenever $j \le i + 1$.

We will proceed as in the one-player case, by processing vertices according to
their topological order. The algorithm is complicated by bridges. An edge $e$ is a
bridge between two vertices $v$ and $u$ if every $v$-$u$ path contains $e$. Furthermore,
if we fix a vertex $k \in V$, then we say that an edge $e$ is a $k$-bridge if $e$ is a
bridge between $k$ and $d$. The following lemma can be proved using the max-flow min-cut
theorem.
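The definition of a $k$-bridge translates directly into a reachability test: an edge is a $k$-bridge exactly when deleting it leaves no path from $k$ to $d$. The following Python sketch illustrates this; the dictionary representation of edges (name to (tail, head)) and the function names are assumptions made for the example, and no attempt is made at efficiency.

```python
def reachable(edges, src, dst, skip=None):
    """True if dst is reachable from src without using the edge named `skip`."""
    stack, seen = [src], {src}
    while stack:
        x = stack.pop()
        if x == dst:
            return True
        for name, (tail, head) in edges.items():
            if name != skip and tail == x and head not in seen:
                seen.add(head)
                stack.append(head)
    return False


def k_bridges(edges, k, d):
    """Names of the edges that lie on every k-d path, in dictionary order."""
    if not reachable(edges, k, d):
        return []
    return [name for name in edges if not reachable(edges, k, d, skip=name)]
```

If the dictionary lists edges in an order compatible with the topological ordering, the returned list is already sorted the way the bridges are indexed in the text; that ordering is a property of the input, not something this sketch enforces.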



Lemma 20. Let $v$ and $u$ be two vertices. There are two edge-disjoint paths
between $v$ and $u$ if, and only if, there is no bridge between $v$ and $u$.



Bridges. Given a vertex $k$, we show how to determine the cost of the $k$-bridges.
Let $b_1, b_2, \ldots, b_m$ denote the list of $k$-bridges sorted according to the
topological ordering $\prec$. That is, if $b_1 = (v_1, u_1)$ and $b_2 = (v_2, u_2)$,
then we have $v_1 \prec v_2$, and so on. Our algorithm is given a partial cost function
$f^a$ such that $f^a_e(j)$ is defined for all $j \le i$, and returns a cost function
$f^{a+1}$ that is an extension of $f^a$ where, for all $l$, we have that
$f^{a+1}_{b_l}(i + 1)$ is defined.

Our algorithm processes the $k$-bridges in reverse topological order, starting
with the final bridge $b_m$. Suppose that we are processing the bridge $b_j = (v, u)$.
We will make one payoff query to find the cost of $b_j$, which is described by the
following diagram.



The dashed lines in the diagram represent paths. They must satisfy some special
requirements, which we now describe. The paths $p_4$ and $p_5$ must be edge disjoint,
apart from $k$-bridges. The following lemma shows that we can always select two
such paths.

Lemma 21. For each $k$-bridge $b_j = (v, u)$, there exist two paths $p_4$ and $p_5$
from $u$ to $d$ such that $p_4 \cap p_5 = \{b_{j+1}, b_{j+2}, \ldots, b_m\}$.

On the other hand, the paths $p_1$, $p_2$, and $p_3$ must satisfy a different set of
constraints, which are formalised by the following lemma.

Lemma 22. Let $b_j = (v, u)$ be a $k$-bridge, and let $p_2$ be an arbitrarily chosen
$o$-$k$ path. There exists an $o$-$v$ path $p_1$ and a $k$-$v$ path $p_3$ such that:
$p_1$ and $p_3$ are edge disjoint; and if $p_1$ visits $k$, then $p_2$ and $p_1$ use
different incoming edges for $k$.

Algorithm 5 shows how the cost of placing $i + 1$ players on each of the $k$-bridges
can be discovered. Note that on line 9, since $s$ assigns one player to $p_1$,
we have $n_e(s) = 1$ for every $e \in p_1$. Therefore, $f^{a+1}_e(n_e(s))$ is known for
every edge $e \in p_1$. Moreover, for every edge $e \in p_4$, we have that
$n_e(s) = i + 1$ if $e$ is a $k$-bridge, and we have $n_e(s) = 1$ otherwise. Since the
algorithm processes the $k$-bridges in reverse order, we have that $f^{a+1}_e(n_e(s))$
is defined for every edge $e \in p_4$. The following lemma shows that line 9 correctly
computes the cost of $b_j$.

Lemma 23. Let $k$ be a vertex, and let $f^a$ be a partial equivalent cost function,
such that $f^a_e(j)$ is defined for every $j \le i$. Algorithm 5 computes a partial
equivalent cost function $f^{a+1}$, such that $f^{a+1}$ is an extension of $f^a$, and
$f^{a+1}_e(i + 1)$ is defined for every $e$ that is a $k$-bridge.





Algorithm 5 FindKBridges($k$)

Input: A vertex $k$, and a partial equivalent cost function $f^a$, such that $f^a_e(j)$
    is defined for every $j \le i$.
Output: A partial equivalent cost function $f^{a+1}$, such that $f^{a+1}$ is an
    extension of $f^a$, and $f^{a+1}_e(i + 1)$ is defined for every $e$ that is a
    $k$-bridge.
 1: for all $e$ and $j$ for which $f^a_e(j)$ is defined do
 2:     $f^{a+1}_e(j) \leftarrow f^a_e(j)$
 3: end for
 4: for $j = m$ to 1 do
 5:     $p_4, p_5 \leftarrow$ paths chosen according to Lemma 21
 6:     $p_1, p_2, p_3 \leftarrow$ paths chosen according to Lemma 22
 7:     $s \leftarrow (1 \mapsto p_1 b_j p_4,\ i \mapsto p_2 p_3 b_j p_5)$
 8:     $c_s \leftarrow$ Query($s$)
 9:     $f^{a+1}_{b_j}(i + 1) \leftarrow c_s(p_1 b_j p_4) - \sum_{e \in p_1} f^{a+1}_e(n_e(s)) - \sum_{e \in p_4} f^{a+1}_e(n_e(s))$
10: end for

Incoming edges of $k$. We now describe the second part of the many-player case.
After finding the cost of each $k$-bridge, we find the cost of each incoming edge
at $k$. The following diagram describes how we find the cost of $e = (v, k)$, an
incoming edge at $k$.

[Diagram omitted: the two paths $p_1$ and $p_2$.]

The path $p$ is an arbitrarily chosen path from $o$ to $v$. The paths $p_1$ and $p_2$
are chosen according to the following lemma.

Lemma 24. There exist two $k$-$d$ paths $p_1, p_2$ such that every edge in
$p_1 \cap p_2$ is a $k$-bridge.

Algorithm 6 shows how we find the cost of putting $i + 1$ players on each
edge $e$ that is incoming at $k$. Consider line 9. Note that every vertex in $p$ is
processed before $k$ is processed, and therefore $f^{a+1}_{e'}(i + 1)$ is known for
every $e' \in p$. Moreover, for every edge $e' \in p_1$, we have that
$n_{e'}(s) = i + 1$ if $e'$ is a $k$-bridge, and we have $n_{e'}(s) = 1$ otherwise. In
either case, $f^{a+1}_{e'}(n_{e'}(s))$ is known for every edge $e' \in p_1$. The
following lemma shows that line 9 correctly computes $f^{a+1}_e(i + 1)$.

Lemma 25. Let $k$ be a vertex, and let $f^a$ be a partial equivalent cost function,
such that $f^a_e(j)$ is defined for all $e \in E$ when $j \le i$, all $e = (v, u)$ with
$u \prec k$ when $j = i + 1$, and all $k$-bridges when $j = i + 1$. Algorithm 6 produces
a partial equivalent cost function $f^{a+1}$ such that $f^{a+1}_e(j)$ is defined for all
$e \in E$ when $j \le i$, and for all $e = (v, u)$ with $u \preceq k$ when $j = i + 1$.

Query complexity. We argue that the algorithm can be implemented so that the
costs for $i + 1$ players can be discovered using at most $|E|$ payoff queries.



Algorithm 6 MultiProcessK

Input: A vertex $k$, and a partial equivalent cost function $f^a$ such that $f^a_e(j)$
    is defined for all $e \in E$ when $j \le i$, all $e = (v, u)$ with $u \prec k$ when
    $j = i + 1$, and all $k$-bridges when $j = i + 1$.
Output: A partial equivalent cost function $f^{a+1}$ such that $f^{a+1}_e(j)$ is defined
    for all $e \in E$ when $j \le i$, and for all $e = (v, u)$ with $u \preceq k$ when
    $j = i + 1$.
 1: for all $e$ and $j$ for which $f^a_e(j)$ is defined do
 2:     $f^{a+1}_e(j) \leftarrow f^a_e(j)$
 3: end for
 4: for all $e = (v, k) \in E$ do
 5:     $p \leftarrow$ an arbitrary $o$-$v$ path
 6:     $p_1, p_2 \leftarrow$ paths chosen according to Lemma 24
 7:     $s \leftarrow (1 \mapsto p e p_1,\ i \mapsto p e p_2)$
 8:     $c_s \leftarrow$ Query($s$)
 9:     $f^{a+1}_e(i + 1) \leftarrow c_s(p e p_1) - \sum_{e' \in p} f^{a+1}_{e'}(n_{e'}(s)) - \sum_{e' \in p_1} f^{a+1}_{e'}(n_{e'}(s))$
10: end for

Every time Algorithm 5 discovers the cost of placing $i + 1$ players on a $k$-bridge,
it makes exactly one payoff query. Every time Algorithm 6 discovers the cost of
an incoming edge $(v, k)$, it makes exactly one payoff query. The key observation
is that the costs discovered by Algorithm 5 do not need to be rediscovered by
Algorithm 6. That is, we can modify Algorithm 6 so that it ignores every incoming
edge $(v, k)$ that has already been processed by Algorithm 5. This modification
ensures that the algorithm uses precisely $|E|$ payoff queries to discover the edge
costs for $i + 1$ players.

6 Conclusions and further work 

We first consider open questions in the setting of payoff queries, which has been 
the main setting for the results in this paper. We then consider alternative query 
models. 

Open questions concerning payoff queries. For strategic-form games, various
questions have been raised by our results. Theorem 8 gives a lower bound on
the payoff query complexity of computing an $\epsilon$-Nash equilibrium of a
$k \times k$ bimatrix game in terms of $k$, for small positive $\epsilon$; what about
for larger $\epsilon < \frac{1}{2}$? What about a non-trivial upper bound for
$\epsilon < \frac{1}{2}$? Also, what is the exact expression for the payoff query
complexity of bimatrix games when $\epsilon \ge \frac{1}{2}$? It would also be
interesting to investigate the payoff query complexity of finding an $\epsilon$-Nash
equilibrium in an $n$-player strategic-form game for small $n > 2$. However, for
polynomial-time algorithms only a weak upper bound of $\epsilon \le 1 - \frac{1}{n}$ is
known [4,25]. To achieve a better $\epsilon$ with only polynomially many queries,
either non-polynomial-time algorithms would need to be used, or the $1 - \frac{1}{n}$
upper bound would need to be improved.

For congestion games, our lower bound of $\log n$ arises from a game with two
parallel links. The upper bound is only a poly-logarithmic factor off from this



lower bound, with the factor depending on m and not n. Thus a lower bound 
construction that also depends on m would be interesting. For DAGs it is unclear 
whether the payoff query complexity is sub-linear in n. Non-trivial lower and 
upper bounds for more general settings, such as asymmetric network congestion 
games (DAG or not) or general (non-network) congestion games would also be 
interesting. 

Other query models. We have defined a payoff query as given by a pure (not
mixed) profile $s$, since that is of main relevance to empirical game-theoretic
modelling. Furthermore, if $s$ were a mixed profile, it could be simulated by sampling
a number of pure profiles from $s$ and making the corresponding sequence
of pure payoff queries. An alternative definition might require a payoff query to
just report a single specified player's payoff, but that would change the query
complexity by a factor of at most $n$.

Our main results have related to exact payoff queries, though other query
models are interesting too. A very natural type of query is a best-response query,
where a strategy $s$ is chosen, and the algorithm is told the players' best responses
to $s$. In general $s$ may have to be a mixed strategy; it is not hard to check that
pure-strategy best-response queries are insufficient: even for a two-player
two-action game, knowledge of the best responses to pure profiles is not sufficient to
identify an $\epsilon$-Nash equilibrium for $\epsilon < \frac{1}{2}$. Fictitious Play
([16], Chapter 2) can be regarded as a query protocol that uses best-response queries
(to mixed strategies) to find a Nash equilibrium in zero-sum games, and essentially a
1/2-Nash equilibrium in general-sum games [18]. We can always synthesize a pure
best-response query with $n(k - 1)$ payoff queries. Hence, for questions of polynomial
query complexity, payoff queries are at least as powerful as best-response queries.
Are there games where best-response queries are much more useful than payoff
queries? If $k$ is large then it is expensive to synthesize best-response queries with
payoff queries. The DMP-algorithm [9] finds a $\frac{1}{2}$-Nash equilibrium via only
two best-response queries, whereas Theorem 4 notes that $O(k)$ payoff queries are
needed.

A noisy payoff query outputs an observation of a random variable taking
values in $[0, 1]$ whose expected value is the true payoff. Alternative versions
might assume that the observed payoff is within some distance $\epsilon$ from the true
payoff. Noisy query models might be more realistic, and they are suggested by
the experimental papers on querying games. However, in a theoretical context,
one could obtain good approximations of the expected payoffs for a profile $s$
by repeated sampling. It would be interesting to understand the power of different
query models.
A Proof of Theorem 9



Proof. Algorithm 1 constructs a directed graph $G$ for the (initially unknown)
game, along with the payoff function. $G$ is the "affects graph" [20] in which
a directed edge $(p', p)$ has the meaning that the behaviour of $p'$ may affect $p$'s
payoff. Note that in Step 2, $|S| \le (n \cdot k)^{d+1}$. In a degree-$d$ graphical
game, any player $p$'s payoffs may be affected by his own strategy, and the strategies
of at most $d$ neighbours $p'$ for which edges $(p', p)$ exist. The existence of edge
$(p', p)$ is equivalent to the existence of strategy profiles $s, s'$ that differ only
in the strategy of $p'$ and in the payoff of $p$. This is what Algorithm 1 checks for.
Finally, when the edges, and hence neighbourhoods of the graphical game have been
found, it is simple to read off each player's payoff matrix from the data in Step 3. □

B A log n lower bound for parallel links 

The following construction shows an $\Omega(\log n)$ lower bound for both payoff
queries and $n$-player payoff queries. We fix a graph $G$ with two parallel links $e_1$
and $e_2$, and we fix the cost of $e_2$ so that $f_{e_2}(i) = 1$ for all
$i \in \mathbb{N}$. We fix the format of $f_{e_1}$ so that it may only return 0 or 2.
Since $f_{e_1}$ is non-decreasing, this implies that it will be a step function with a
single step. We say that the step is at location $i \in \mathbb{N}$ if $f_{e_1}(j) = 0$
for all $j \le i$, and $f_{e_1}(j) = 2$ for all $j > i$. The precise location
of the step will be decided by an adversary, in response to the queries that are
received.

The adversary's strategy maintains two integers $l$ and $u$ with $l < u$, and
initially the adversary sets $l = 0$ and $u = n$. Intuitively, for all values below $l$
the adversary has fixed $f_{e_1}$ to 0, and for all values above $u$ the adversary has
fixed $f_{e_1}$ to 2. The values between $l$ and $u$ are yet to be fixed, and all values
in this range could potentially be the location of the step.

Suppose that the adversary receives the query $s$. The adversary will respond
with a pair $(c_1, c_2)$, where $c_1$ is the cost of $e_1$, and $c_2$ is the cost of
$e_2$. The adversary uses the following strategy:

— If $n_{e_1}(s) \le l$, then the adversary responds with $(0, 1)$. If
$n_{e_1}(s) \ge u$, then the adversary responds with $(2, 1)$.

— If $n_{e_1}(s)$ lies between $l$ and $u$ and is closer to $l$ than it is to $u$, then
the adversary sets $l = n_{e_1}(s)$, and responds with $(0, 1)$.

— If $n_{e_1}(s)$ lies between $l$ and $u$ and is closer to $u$ than it is to $l$, then
the adversary sets $u = n_{e_1}(s)$, and responds with $(2, 1)$.

Note that, if there exists an $i$ with $l < i < u$, then the querier cannot
correctly determine the Nash equilibrium. This is because the step could be at
location $i$, or it could be at location $i - 1$. In the former case, the unique Nash
equilibrium assigns $i$ players to $e_1$ and $n - i$ players to $e_2$, and in the latter
case the unique Nash equilibrium assigns $i - 1$ players to $e_1$ and $n - i + 1$
players to $e_2$. By construction, the adversary's strategy ensures that, in response to
each query, the gap between $u$ and $l$ may decrease by at most one half. Thus, the
querier must make $\Omega(\log n)$ queries to correctly determine the Nash equilibrium.
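The adversary argument can be simulated with a short Python sketch. Two simplifications are assumptions of this sketch and not part of the construction: a query is represented only by the number of players it places on $e_1$, and a query exactly half-way between $l$ and $u$ is treated as closer to $l$. For convenience the querier strategy is also handed the current $l$ and $u$, which a real querier would have to track from the responses.

```python
class StepAdversary:
    """Keeps the step location of f_{e_1} undetermined for as long as possible."""

    def __init__(self, n):
        self.l, self.u = 0, n

    def respond(self, n1):
        """Return (cost of e_1, cost of e_2) when n1 players are placed on e_1."""
        if n1 <= self.l:
            return (0, 1)
        if n1 >= self.u:
            return (2, 1)
        if n1 - self.l <= self.u - n1:  # tie treated as closer to l (assumption)
            self.l = n1
            return (0, 1)
        self.u = n1
        return (2, 1)

    def undetermined(self):
        # some i with l < i < u still exists
        return self.u - self.l >= 2


def queries_until_determined(n, choose):
    """Count queries made by strategy choose(l, u) before the step is pinned down."""
    adversary, count = StepAdversary(n), 0
    while adversary.undetermined():
        adversary.respond(choose(adversary.l, adversary.u))
        count += 1
    return count
```

With $n = 16$, a bisecting querier such as `lambda l, u: (l + u) // 2` needs four queries against this adversary, matching $\log_2 16$, while a querier that advances one position at a time needs far more.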



C Proof of Lemma 15 



Proof. Our algorithm will check, for each pair of edges $e = (v, u)$ and
$e' = (v', u')$, whether $e$ and $e'$ are dependent. This is done in the following way.
Note that if $v = v'$, then $e$ and $e'$ cannot possibly be dependent. Thus, we can
assume without loss of generality that $v \prec v'$. The algorithm performs two checks:

— Delete $e$ and verify that there is no path from $o$ to $v'$.

— Delete $e'$ and verify that there is no path from $u$ to $d$.

The first check ensures that every path that uses $e'$ must also use $e$. The second
check ensures that every path that uses $e$ must also use $e'$. Thus, if both checks
are satisfied, then $e$ and $e'$ are dependent. On the other hand, if one of the checks
is not satisfied, then we can construct an $o$-$d$ path that uses $e$ and not $e'$, or
a path that uses $e'$ and not $e$, which verifies that $e$ and $e'$ are not dependent.
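The two checks can be sketched in Python using plain reachability, with no payoff queries involved. As an assumption of this sketch, edges are stored in a dictionary from an edge name to a (tail, head) pair, and the caller passes the two edges in an order where the tail of the first precedes the tail of the second in the topological ordering.

```python
def reachable(edges, src, dst, skip=None):
    """True if dst is reachable from src without using the edge named `skip`."""
    stack, seen = [src], {src}
    while stack:
        x = stack.pop()
        if x == dst:
            return True
        for name, (tail, head) in edges.items():
            if name != skip and tail == x and head not in seen:
                seen.add(head)
                stack.append(head)
    return False


def dependent(edges, e, e_prime, o, d):
    """Dependence test for e = (v, u) and e' = (v', u'), assuming v precedes v'."""
    v, u = edges[e]
    v_prime, _ = edges[e_prime]
    if v == v_prime:
        return False  # two edges leaving the same vertex never share a path
    # check 1: with e deleted, the tail of e' must be unreachable from o
    check1 = not reachable(edges, o, v_prime, skip=e)
    # check 2: with e' deleted, d must be unreachable from the head of e
    check2 = not reachable(edges, u, d, skip=e_prime)
    return check1 and check2
```

On a two-edge chain the test reports dependence, and adding a parallel copy of the second edge breaks it, since a path can then use the first edge without the second.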

Whenever the algorithm finds a pair of edges $e, e' \in E$ that are dependent,
it contracts $e'$. More formally, if $e' = (v, u)$, then the algorithm constructs a
new congestion game $\Gamma' = (N, V', E', (f'_e)_{e \in E'}, o, d)$ where
$V' = V \setminus \{u\}$, and $E'$ contains:

— every edge $(w, x) \in E$ with $x \ne u$ and $w \ne u$, and

— an edge $(v, x)$ for every edge $(u, x) \in E$.

Note that $E'$ does not contain $e'$. Moreover, we define the cost functions $f'$ as
follows. For each edge $e'' \ne e$, we set $f'_{e''}(i) = f_{e''}(i)$ for all $i$. For
the edge $e$, we define $f'_e(i) = f_e(i) + f_{e'}(i)$ for all $i$.

We argue that this operation is correct. Since $e$ and $e'$ are dependent, we
have that, for every strategy profile $s$, for every $o$-$d$ path $p$:

$$\sum_{e'' \in p} f_{e''}(n_{e''}(s)) = \sum_{e'' \in p \setminus \{e'\}} f'_{e''}(n_{e''}(s)).$$

Therefore, we can easily translate each Nash equilibrium of $\Gamma'$ into a Nash
equilibrium for $\Gamma$.

Thus, the algorithm constructs a sequence of games $\Gamma_1, \Gamma_2, \ldots$, where
each game $\Gamma_{i+1}$ is obtained by contracting an edge in $\Gamma_i$. Moreover, the
Nash equilibria for $\Gamma_{i+1}$ can be translated to $\Gamma_i$, which implies that
the algorithm is correct.

This algorithm can obviously be implemented in polynomial time. Moreover, 
since the algorithm only inspects structural properties of the graph, it does not 
make any payoff queries. 

D Proof of Lemma 18 

Proof. It can be verified that the algorithm assigns a cost to $f^{a+1}_e(1)$ for every
edge $e = (v, u)$ with $u \preceq k$. To complete the proof of the lemma, we must show
that $f^{a+1}$ is a partial equivalent cost function. Since $f^a$ is a partial
equivalent cost function, there must exist a total extension of $f^a$ that is
equivalent to $f$. Let $f'$ denote this extension. We will use $f'$ to construct $f''$,
which is a total extension of $f^{a+1}$ that is equivalent to $f$.

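Let $e = (v, k)$ be an incoming edge at $k$. We begin by deriving a formula for
$t(ep)$, which is computed on line 9. Note that, since $f'$ is equivalent to $f$, we
have $c_s(p'ep) = \sum_{e' \in p'ep} f'_{e'}(1)$. Note also that
$f^a_{e'}(1) = f'_{e'}(1)$ for every edge $e' \in p'$. Therefore, we have the following:

\begin{align*}
t(ep) &= c_s(p'ep) - \sum_{e' \in p'} f^a_{e'}(1) \\
      &= \sum_{e' \in p'ep} f'_{e'}(1) - \sum_{e' \in p'} f'_{e'}(1) \\
      &= \sum_{e' \in ep} f'_{e'}(1).
\end{align*}

For each edge $e = (v, k)$ with $e \ne m$, line 14 sets:

\begin{align*}
f^{a+1}_e(1) &= t(ep) - t(mp) \\
             &= \sum_{e' \in ep} f'_{e'}(1) - \sum_{e' \in mp} f'_{e'}(1) \\
             &= f'_e(1) - f'_m(1).
\end{align*}

Note also that line 12 sets:

$$f^{a+1}_m(1) = 0 = f'_m(1) - f'_m(1).$$

Hence, we can conclude that $f^{a+1}_e(1) = f'_e(1) - f'_m(1)$ for every incoming edge
$e = (v, k)$.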

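We construct the total cost function $f''$ as follows. For every edge $e = (v, u)$,
and every $i \le n$, we set:

$$f''_e(i) = \begin{cases}
f'_e(i) - f'_m(1) & \text{if } u = k, \\
f'_e(i) + f'_m(1) & \text{if } v = k, \\
f'_e(i) & \text{otherwise.}
\end{cases}$$

Since we have shown that $f^{a+1}_e(1) = f'_e(1) - f'_m(1)$ for every incoming edge
$e = (v, k)$, we have that $f''$ is a total extension of $f^{a+1}$.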

We must now show that $f''$ and $f$ are equivalent. We will do this by showing
that $f''$ and $f'$ are equivalent. Let $s = (s_1, s_2, \ldots, s_n)$ be an arbitrarily
chosen strategy profile. If $s_j$ does not visit $k$, then we have:

$$\sum_{e \in s_j} f''_e(n_e(s)) = \sum_{e \in s_j} f'_e(n_e(s)).$$

On the other hand, if $s_j$ does visit $k$, then it must use exactly one edge $(v, u)$
with $u = k$, and exactly one edge $(v, u)$ with $v = k$. Therefore, we have:

\begin{align*}
\sum_{e \in s_j} f''_e(n_e(s)) &= \sum_{e \in s_j} f'_e(n_e(s)) - f'_m(1) + f'_m(1) \\
                               &= \sum_{e \in s_j} f'_e(n_e(s)).
\end{align*}

Therefore, $f''$ is equivalent to $f'$, which also implies that it is equivalent to $f$.
Thus, we have found a total extension of $f^{a+1}$ that is equivalent to $f$, as
required.

E Proof of Lemma 19 

Proof. Since /" is a partial equivalent cost function, there must exist a cost 
function /' that is an extension of where /' is equivalent to /. We show that 
/' is also an extension of /"+^. 

Let e= {v,d)he an incoming edge at d. Consider line 8 of the algorithm. Note 
that, since /' is equivalent to /, we have Cs(pe) = ^^'epe fe'W- Furthermore, 
since /' is an extension of we have /"/(I) = /e'(l) for every e' G p. Therefore, 
we have: 

/«+i(l)=Cs(pe)^E/e'(l) 

e'ep 

= E/e'(l)-E/e'(l) 

= /:(!)• 

We also have $f^{a+1}_{e}(1) = f'_{e}(1)$ for every edge $e = (v, u)$ with $u \prec d$, and we have
shown that $f^{a+1}_{e}(1) = f'_{e}(1)$ for every edge $e = (v, u)$ with $u = d$. Therefore
$f'$ is an extension of $f^{a+1}$, which implies that $f^{a+1}$ is a partial equivalent cost
function.
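
The subtraction step in this proof can be illustrated with a small example. This is only a sketch under stated assumptions: the hidden cost table, the oracle `query_single_player` and the helper `learn_edge` are hypothetical names, and the query models a single player using the path alone, so every edge on it has load 1.

```python
# Hypothetical hidden single-player costs; the learner never reads this directly.
hidden_cost = {("o", "a"): 4, ("a", "d"): 7, ("o", "d"): 9}

def query_single_player(path):
    """Payoff query: cost of one player using `path` alone (all loads are 1)."""
    return sum(hidden_cost[e] for e in path)

def learn_edge(prefix, e, known):
    """Recover the cost of e from one query on the path `prefix` followed by e.

    Every edge of `prefix` must already have a known cost.
    """
    return query_single_player(prefix + [e]) - sum(known[x] for x in prefix)

# Costs already learned for edges ending before d (assumed given).
known = {("o", "a"): 4}

# One query per incoming edge at d, then subtract the known prefix cost.
known[("a", "d")] = learn_edge([("o", "a")], ("a", "d"), known)
known[("o", "d")] = learn_edge([], ("o", "d"), known)
```

Each incoming edge at d costs one query, and the learned value matches the hidden one exactly because the prefix costs are already known.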



F Proof of Lemma 20 

Proof. Let $(V, E)$ be a graph, and let $v, u \in V$ be two vertices. We construct a
network flow instance where every edge $e \in E$ has capacity 1, and we ask for the
maximum flow between $v$ and $u$. Since each edge has capacity 1, we have that
the maximum flow between $v$ and $u$ is greater than 1 if, and only if, there are
two edge-disjoint paths between $v$ and $u$. Moreover, by the max-flow min-cut
theorem, the maximum flow from $v$ to $u$ is greater than 1 if and only if there is
no bridge between $v$ and $u$.
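
The flow argument can be sketched in code as follows. This is an illustrative implementation, not one taken from the paper: it assumes a directed graph given as a list of (tail, head) pairs, uses a standard breadth-first augmenting-path search, and the function names are hypothetical. `bridges_between` is a brute-force cross-check that lists the edges whose removal disconnects the two vertices.

```python
from collections import deque

def max_flow_unit(edges, source, sink):
    """Maximum flow when every directed edge has capacity 1 (BFS augmenting paths)."""
    if source == sink:
        raise ValueError("source and sink must differ")
    cap, adj = {}, {}
    for v, u in edges:
        cap[(v, u)] = cap.get((v, u), 0) + 1
        cap.setdefault((u, v), 0)            # residual (reverse) arc
        adj.setdefault(v, set()).add(u)
        adj.setdefault(u, set()).add(v)
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            x = queue.popleft()
            for y in adj.get(x, ()):
                if y not in parent and cap[(x, y)] > 0:
                    parent[y] = x
                    queue.append(y)
        if sink not in parent:
            return flow                      # no augmenting path remains
        y = sink
        while parent[y] is not None:         # push one unit along the path found
            x = parent[y]
            cap[(x, y)] -= 1
            cap[(y, x)] += 1
            y = x
        flow += 1

def two_edge_disjoint_paths(edges, v, u):
    """True if and only if there are two edge-disjoint v-u paths."""
    return max_flow_unit(edges, v, u) > 1

def reachable(edges, source, sink):
    seen, stack = {source}, [source]
    while stack:
        x = stack.pop()
        for a, b in edges:
            if a == x and b not in seen:
                seen.add(b)
                stack.append(b)
    return sink in seen

def bridges_between(edges, v, u):
    """Brute force: edges whose removal leaves no v-u path."""
    return [e for e in edges
            if not reachable([x for x in edges if x != e], v, u)]

diamond = [("v", "a"), ("v", "b"), ("a", "u"), ("b", "u")]
funnel = [("v", "a"), ("v", "b"), ("a", "c"), ("b", "c"), ("c", "u")]
```

On `diamond` there is no bridge and two edge-disjoint paths exist; on `funnel` the edge `("c", "u")` lies on every path, so the maximum flow is 1.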



G Proof of Lemma 21 



Proof. Note that for each $l$, there cannot exist a bridge between $b_l$ and $b_{l+1}$.
Therefore, we can apply Lemma 20 to argue that there must exist two
edge-disjoint paths between $b_l$ and $b_{l+1}$. For the same reason, we can find two
edge-disjoint paths between $b_m$ and $d$. To complete the proof, we simply concatenate
these paths. □



H Proof of Lemma 22 

Proof. We show how $p_1$ and $p_3$ can be constructed. This splits into two cases,
and we begin by considering the bridges $b_j$ with $j > 1$. Due to our preprocessing,
$b_j$ and $b_{j-1}$ cannot be dependent. Note that every o-d path that uses $b_{j-1}$ must
also use $b_j$. Therefore, there must exist an o-d path $p$ that uses $b_j$ and not $b_{j-1}$.
We fix $p_1$ to be the prefix of $p$ up to the point where it visits $b_j$. Let $p_3$ be
an arbitrarily selected path from $k$ to $b_{j-1}$. Note that $p_1$ cannot share an edge
with $p_3$, because otherwise $p_1$ would be forced to visit $b_{j-1}$.

We now show how $p_3$ can be extended to reach $b_j$ without intersecting $p_1$.
Since there are no bridges between $b_{j-1}$ and $b_j$, we can apply Lemma 20 to
obtain two edge-disjoint paths $q$ and $q'$ from $b_{j-1}$ to $b_j$. If one of these paths
does not intersect with $p_1$, then we are done. Otherwise suppose, without loss of
generality, that $p_1$ intersects with $q$ before it intersects with $q'$. We create a path
$p'_1$ that follows $p_1$ until the first intersection with $q$, and follows $q$ after that.
Since $q$ and $q'$ are disjoint, the paths $p'_1$ and $p_3 q'$ satisfy the required conditions.

Now we consider the bridge $b_1$. If $k$ has at least two incoming edges, then
we can apply Lemma 20 to find two edge-disjoint paths from $k$ to $b_1$, and we
can easily construct $p_1$ and $p_3$ using these paths. Otherwise, let $e$ be the sole
incoming edge at $k$. Since $e$ and $b_1$ are not dependent, we can find a path $p_1$
from $o$ to $b_1$ which does not use $e$, and we can use the same technique as we did
for $j > 1$ to find a path $p_3$ from $k$ to $b_1$ that does not intersect with $p_1$. □



I Proof of Lemma 23 

Proof. It can be verified that the algorithm constructs a partial cost function
$f^{a+1}$ that is an extension of $f^{a}$, where $f^{a+1}$ is defined for every $e$ that is a
$k$-bridge. We must show that $f^{a+1}$ is partially equivalent to $f$. Since $f^{a}$ is partially
equivalent to $f$, there exists some total cost function $f'$ that is an extension of
$f^{a}$ such that $f'$ is equivalent to $f$. We will show that $f'$ is also an extension of
$f^{a+1}$.

We will do so inductively. The inductive hypothesis is that $f^{a+1}_{e}(i + 1) =
f'_{e}(i + 1)$ for every $e = b_l$ with $l > j$. The base case, where $j = m$, is trivial,
because there are no $k$-bridges $b_l$ with $l > m$. Now suppose that we have shown
the inductive hypothesis for some $j$. We show that $f^{a+1}_{b_j}(i + 1) = f'_{b_j}(i + 1)$. Let
$s$ be the strategy queried when the algorithm considers $b_j$.



Consider an edge $e \in p_1$. By Lemma 22, we have that $n_e(s) = 1$. By
assumption, we have that $f^{a+1}_{e}(1) = f'_{e}(1)$ for every edge $e$, and therefore
$f^{a+1}_{e}(n_e(s)) = f'_{e}(n_e(s))$ for every edge $e \in p_1$.

Now consider an edge $e \in p_4$. By Lemma 21, we have that $n_e(s) = 1$ whenever
$e$ is not a $k$-bridge, and we have $n_e(s) = i + 1$ whenever $e$ is a $k$-bridge. Therefore,
by the inductive hypothesis, we have that $f^{a+1}_{e}(n_e(s)) = f'_{e}(n_e(s))$ for every
$e \in p_4$.

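Since $f'$ is equivalent to $f$, we have that
$c_s(p_1 b_j p_4) = \sum_{e \in p_1 b_j p_4} f'_{e}(n_e(s))$. Therefore, line 9 sets:

\[
\begin{aligned}
f^{a+1}_{b_j}(i + 1) &= c_s(p_1 b_j p_4) - \sum_{e \in p_1} f^{a+1}_{e}(n_e(s)) - \sum_{e \in p_4} f^{a+1}_{e}(n_e(s)) \\
                     &= \sum_{e \in p_1 b_j p_4} f'_{e}(n_e(s)) - \sum_{e \in p_1} f'_{e}(n_e(s)) - \sum_{e \in p_4} f'_{e}(n_e(s)) \\
                     &= f'_{b_j}(n_{b_j}(s)) = f'_{b_j}(i + 1).
\end{aligned}
\]

Thus, the algorithm correctly sets $f^{a+1}_{b_j}(i + 1) = f'_{b_j}(i + 1)$.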



J Proof of Lemma 24 



Proof. Let $b_1$ be the first $k$-bridge. By Lemma 20 there exist two edge-disjoint paths
from $k$ to $b_1$. The proof can then be completed by applying Lemma 21. □



K Proof of Lemma 25 



Proof. It can be verified that the algorithm constructs a partial cost function
$f^{a+1}$ that is defined for the correct parameters. We must show that $f^{a+1}$ is
partially equivalent to $f$. Note that $f^{a+1}$ is an extension of $f^{a}$. Since $f^{a}$ is
partially equivalent to $f$, there exists some total cost function $f'$ that is an
extension of $f^{a}$, such that $f'$ is equivalent to $f$. We will show that $f'$ is also an
extension of $f^{a+1}$.

Let $e = (v, k)$ be an incoming edge at $k$. We will show that $f^{a+1}_{e}(i + 1) =
f'_{e}(i + 1)$. Let $s$ be the strategy that the algorithm queries while processing $e$.
Since $f'$ is equivalent to $f$, we have that $c_s(p e p_1) = \sum_{e' \in p e p_1} f'_{e'}(n_{e'}(s))$. For
every edge $e' \in p$, we have $n_{e'}(s) = i + 1$. Since every vertex $w$ visited by $p$
satisfies $w \prec k$, for every $e' \in p$ we must have $f^{a+1}_{e'}(n_{e'}(s)) = f^{a}_{e'}(n_{e'}(s)) =
f'_{e'}(n_{e'}(s))$. For every edge $e' \in p_1$, we have $n_{e'}(s) = 1$ if $e'$ is not a $k$-bridge,
and we have $n_{e'}(s) = i + 1$ if $e'$ is a $k$-bridge. In either case, we have that
$f^{a+1}_{e'}(n_{e'}(s)) = f^{a}_{e'}(n_{e'}(s)) = f'_{e'}(n_{e'}(s))$ for every edge $e' \in p_1$. Therefore, line 9
sets:

\[
\begin{aligned}
f^{a+1}_{e}(i + 1) &= c_s(p e p_1) - \sum_{e' \in p} f^{a+1}_{e'}(n_{e'}(s)) - \sum_{e' \in p_1} f^{a+1}_{e'}(n_{e'}(s)) \\
                   &= \sum_{e' \in p e p_1} f'_{e'}(n_{e'}(s)) - \sum_{e' \in p} f'_{e'}(n_{e'}(s)) - \sum_{e' \in p_1} f'_{e'}(n_{e'}(s)) \\
                   &= f'_{e}(n_e(s)) = f'_{e}(i + 1).
\end{aligned}
\]

Therefore, for each incoming edge $e = (v, k)$, we have that $f^{a+1}_{e}(i + 1) = f'_{e}(i + 1)$.
Hence, $f'$ is an extension of $f^{a+1}$, which implies that $f^{a+1}$ is partially equivalent
to $f$.



